The fundamental problem of chemical physiology and of
embryology is to understand why tissue cells do not all 
express, all the time, all the potentialities inherent in their 
genome. 

-François Jacob and Jacques Monod,
article in Journal of Molecular Biology, 1961 


Of the 4,000 or so genes in the typical bacterial
genome, or the perhaps 35,000 genes in the human
genome, only a fraction are expressed in a cell at any
given time. Some gene products are present in very large
amounts: the elongation factors required for protein
synthesis, for example, are among the most abundant
proteins in bacteria, and ribulose 1,5-bisphosphate
carboxylase/oxygenase (rubisco) of plants and
photosynthetic bacteria is, as far as we know, the most
abundant enzyme in the biosphere. Other gene products occur
in much smaller amounts; for instance, a cell may contain
only a few molecules of the enzymes that repair rare
DNA lesions. Requirements for some gene products
change over time. The need for enzymes in certain
metabolic pathways may wax and wane as food sources
change or are depleted. During development of a
multicellular organism, some proteins that influence
cellular differentiation are present for just a brief time
in only a few cells. Specialization of cellular function can
dramatically affect the need for various gene products; an
example is the uniquely high concentration of a single
protein, hemoglobin, in erythrocytes. Given the high
cost of protein synthesis, regulation of gene expression
is essential to making optimal use of available energy.

The cellular concentration of a protein is determined
by a delicate balance of at least seven processes,
each having several potential points of regulation:

1. Synthesis of the primary RNA transcript
(transcription)

2. Posttranscriptional modification of mRNA 

3. Messenger RNA degradation 

4. Protein synthesis (translation) 

5. Posttranslational modification of proteins 

6. Protein targeting and transport 

7. Protein degradation 

These processes are summarized in Figure 28-1. We
have examined several of these mechanisms in previous
chapters. Posttranscriptional modification of mRNA, by
processes such as alternative splicing (see
Fig. 26-19b) or RNA editing (see Box 27-1), can affect
which proteins are produced from an mRNA transcript
and in what amounts. A variety of nucleotide sequences
in an mRNA can affect the rate of its degradation
(p. 1020). Many factors affect the rate at which an mRNA
is translated into a protein, as well as the
posttranslational modification, targeting, and eventual
degradation of that protein (Chapter 27).
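
The idea that a protein's concentration reflects a balance between synthesis and degradation can be illustrated with a deliberately simplified two-step model. This is a sketch only: it collapses the seven processes into four rate constants (transcription, mRNA decay, translation, protein decay), ignores modification and targeting, and uses hypothetical rate constants that do not come from the text.

```python
def steady_state_levels(k_transcription, k_mrna_decay,
                        k_translation, k_protein_decay):
    """Steady state of a simplified two-step synthesis/degradation model.

    dm/dt = k_transcription - k_mrna_decay * m
    dp/dt = k_translation * m - k_protein_decay * p
    Setting both derivatives to zero gives the values returned here.
    """
    if k_mrna_decay <= 0 or k_protein_decay <= 0:
        raise ValueError("decay constants must be positive")
    mrna = k_transcription / k_mrna_decay
    protein = k_translation * mrna / k_protein_decay
    return mrna, protein


# Hypothetical rate constants (per minute), chosen only for illustration.
baseline = steady_state_levels(2.0, 0.5, 10.0, 0.25)
# Halving the transcription rate halves both steady-state levels.
repressed = steady_state_levels(1.0, 0.5, 10.0, 0.25)
print(baseline, repressed)
```

In this toy model a change at the first step (transcription) propagates to every downstream quantity, which is one way to see why the beginning of the pathway is an efficient place for regulation.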

This chapter focuses primarily on the regulation of
transcription initiation, although aspects of
posttranscriptional and translational regulation are also
described. Of the regulatory processes illustrated in
Figure 28-1, those operating at the level of transcription
initiation are the best documented and probably the most
common. As in all biochemical processes, an efficient
place for regulation is at the beginning of the pathway.
Because synthesis of informational macromolecules is
so extraordinarily expensive in terms of energy,
elaborate mechanisms have evolved to regulate the process.
Researchers continue to discover complex and
sometimes surprising regulatory mechanisms. Increasingly,
posttranscriptional and translational regulation are
proving to be among the more important of these
processes, especially in eukaryotes. In fact, the
regulatory processes themselves can involve a considerable
investment of chemical energy.

FIGURE 28-1 Seven processes that affect the steady-state
concentration of a protein. Each process has several
potential points of regulation.

Control of transcription initiation permits the
synchronized regulation of multiple genes encoding
products with interdependent activities. For example, when
their DNA is heavily damaged, bacterial cells require a
coordinated increase in the levels of the many DNA
repair enzymes. And perhaps the most sophisticated form
of coordination occurs in the complex regulatory circuits
that guide the development of multicellular eukaryotes,
which can involve many types of regulatory mechanisms.

We begin by examining the interactions between
proteins and DNA that are the key to transcriptional
regulation. We next discuss the specific proteins that
influence the expression of specific genes, first in
prokaryotic and then in eukaryotic cells. Information about
posttranscriptional and translational regulation is
included in the discussion, where relevant, to provide a
more complete overview of the rich complexity of
regulatory mechanisms.


28.1 Principles of Gene Regulation 

Genes for products that are required at all times, such
as those for the enzymes of central metabolic
pathways, are expressed at a more or less constant level in
virtually every cell of a species or organism. Such genes
are often referred to as housekeeping genes.
Unvarying expression of a gene is called constitutive
gene expression.

For other gene products, cellular levels rise and fall
in response to molecular signals; this is regulated gene
expression. Gene products that increase in
concentration under particular molecular circumstances are
referred to as inducible; the process of increasing their
expression is induction. The expression of many of the
genes encoding DNA repair enzymes, for example, is
induced by high levels of DNA damage. Conversely, gene
products that decrease in concentration in response to
a molecular signal are referred to as repressible, and
the process is called repression. For example, in
bacteria, ample supplies of tryptophan lead to repression
of the genes for the enzymes that catalyze tryptophan
biosynthesis.

Transcription is mediated and regulated by protein-DNA
interactions, especially those involving the protein
components of RNA polymerase (Chapter 26). We first
consider how the activity of RNA polymerase is
regulated, and proceed to a general description of the
proteins participating in this process. We then examine the
molecular basis for the recognition of specific DNA
sequences by DNA-binding proteins.

RNA Polymerase Binds to DNA at Promoters 

RNA polymerases bind to DNA and initiate
transcription at promoters (see Fig. 26-5), sites generally
found near points at which RNA synthesis begins on the DNA
template. The regulation of transcription initiation
often entails changes in how RNA polymerase interacts
with a promoter.

The nucleotide sequences of promoters vary
considerably, affecting the binding affinity of RNA
polymerases and thus the frequency of transcription
initiation. Some Escherichia coli genes are transcribed once
per second, others less than once per cell generation. Much
of this variation is due to differences in promoter
sequence. In the absence of regulatory proteins, differences
in promoter sequences may affect the frequency of
transcription initiation by a factor of 1,000 or more. Most
E. coli promoters have a sequence close to a consensus
(Fig. 28-2). Mutations that result in a shift away from the
consensus sequence usually decrease promoter function;
conversely, mutations toward consensus usually enhance
promoter function.

FIGURE 28-2 Consensus sequence for many E. coli promoters.
Most base substitutions in the -10 and -35 regions have a
negative effect on promoter function. Some promoters also
include the UP (upstream promoter) element (see Fig. 26-5).
By convention, DNA sequences are shown as they exist in the
nontemplate strand, with the 5' terminus on the left.
Nucleotides are numbered from the transcription start
site, with positive numbers to the right (in the direction
of transcription) and negative numbers to the left. N
indicates any nucleotide.
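
The link between promoter strength and closeness to the consensus can be made concrete with a crude match count. The sketch below assumes the widely cited σ⁷⁰ consensus hexamers, TTGACA at -35 and TATAAT at -10; these sequences are not quoted in the text above, and the function names and test sequences are hypothetical. A match count is only a rough proxy and does not predict initiation rates quantitatively.

```python
# Widely cited E. coli sigma-70 consensus hexamers (assumed here;
# the text above refers to them only through Figure 28-2).
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def consensus_score(box35: str, box10: str) -> int:
    """Total matches (0 to 12) across the -35 and -10 regions."""
    return matches(box35, CONSENSUS_35) + matches(box10, CONSENSUS_10)


# Illustrative sequences only: a perfect match and a weaker variant.
perfect = consensus_score("TTGACA", "TATAAT")
weaker = consensus_score("TTTACA", "TATGTT")
print(perfect, weaker)
```

A mutation that converts a mismatched position to the consensus base raises the score by one, mirroring the observation that mutations toward consensus usually enhance promoter function.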

Although housekeeping genes are expressed
constitutively, the cellular concentrations of the proteins
they encode vary widely. For these genes, the RNA
polymerase-promoter interaction strongly influences
the rate of transcription initiation; differences in
promoter sequence allow the cell to synthesize the
appropriate level of each housekeeping gene product.

The basal rate of transcription initiation at the
promoters of nonhousekeeping genes is also determined by
the promoter sequence, but expression of these genes
is further modulated by regulatory proteins. Many of
these proteins work by enhancing or interfering with the
interaction between RNA polymerase and the promoter.

The sequences of eukaryotic promoters are more
variable than their prokaryotic counterparts (see
Fig. 26-8). The three eukaryotic RNA polymerases
usually require an array of general transcription factors in
order to bind to a promoter. Yet, as with prokaryotic
gene expression, the basal level of transcription is
determined by the effect of promoter sequences on the
function of RNA polymerase and its associated
transcription factors.


Transcription Initiation Is Regulated by Proteins That 
Bind to or near Promoters 

At least three types of proteins regulate transcription 
initiation by RNA polymerase: specificity factors alter 
the specificity of RNA polymerase for a given promoter 
or set of promoters; repressors impede access of RNA 
polymerase to the promoter; and activators enhance 
the RNA polymerase-promoter interaction. 

We introduced prokaryotic specificity factors in
Chapter 26 (see Fig. 26-5), although we did not refer to
them by that name. The σ subunit of the E. coli RNA
polymerase holoenzyme is a specificity factor that
mediates promoter recognition and binding. Most E. coli
promoters are recognized by a single σ subunit (Mr 70,000),
σ⁷⁰. Under some conditions, some of the σ⁷⁰ subunits are
replaced by another specificity factor. One notable case
arises when the bacteria are subjected to heat stress,
leading to the replacement of σ⁷⁰ by σ³² (Mr 32,000).
When bound to σ³², RNA polymerase is directed to a
specialized set of promoters with a different consensus
sequence (Fig. 28-3). These promoters control the
expression of a set of genes that encode the heat-shock
response proteins. Thus, through changes in the binding
affinity of the polymerase that direct it to different
promoters, a set of genes involved in related processes is
coordinately regulated. In eukaryotic cells, some of the
general transcription factors, in particular the TATA-binding
protein (TBP; see Fig. 26-8), may be considered
specificity factors.

Repressors bind to specific sites on the DNA. In
prokaryotic cells, such binding sites, called operators,
are generally near a promoter. RNA polymerase binding,
or its movement along the DNA after binding, is blocked
when the repressor is present. Regulation by means of
a repressor protein that blocks transcription is referred
to as negative regulation. Repressor binding to DNA
is regulated by a molecular signal (or effector), usually
a small molecule or a protein, that binds to the
repressor and causes a conformational change. The interaction
between repressor and signal molecule either increases
or decreases transcription. In some cases, the
conformational change results in dissociation of a DNA-bound
repressor from the operator (Fig. 28-4a). Transcription
initiation can then proceed unhindered. In other cases,
interaction between an inactive repressor and the signal
molecule causes the repressor to bind to the operator
(Fig. 28-4b). In eukaryotic cells, the binding site for a
repressor may be some distance from the promoter;
binding has the same effect as in bacterial cells:
inhibiting the assembly or activity of a transcription
complex at the promoter.

FIGURE 28-3 Consensus sequence for promoters that regulate
expression of the E. coli heat-shock genes. This system
responds to temperature increases as well as some other
environmental stresses, resulting in the induction of a set
of proteins. Binding of RNA polymerase to heat-shock
promoters is mediated by a specialized σ subunit of the
polymerase, σ³², which replaces σ⁷⁰ in the RNA polymerase
initiation complex.

Activators provide a molecular counterpoint to
repressors; they bind to DNA and enhance the activity of
RNA polymerase at a promoter; this is positive
regulation. Activator binding sites are often adjacent to
promoters that are bound weakly or not at all by RNA
polymerase alone, such that little transcription occurs
in the absence of the activator. Some eukaryotic
activators bind to DNA sites, called enhancers, that are
quite distant from the promoter, affecting the rate of
transcription at a promoter that may be located
thousands of base pairs away. Some activators are normally
bound to DNA, enhancing transcription until dissociation
of the activator is triggered by the binding of a signal
molecule (Fig. 28-4c). In other cases the activator binds
to DNA only after interaction with a signal molecule
(Fig. 28-4d). Signal molecules can therefore increase or
decrease transcription, depending on how they affect
the activator. Positive regulation is particularly common
in eukaryotes, as we shall see.

FIGURE 28-4 Common patterns of regulation of transcription
initiation. Two types of negative regulation are
illustrated. (a) Repressor (pink) binds to the operator in
the absence of the molecular signal; the external signal
causes dissociation of the repressor to permit
transcription. (b) Repressor binds in the presence of the
signal; the repressor dissociates and transcription ensues
when the signal is removed. Positive regulation is mediated
by gene activators. Again, two types are shown.
(c) Activator (green) binds in the absence of the
molecular signal and transcription proceeds; when the signal
is added, the activator dissociates and transcription is
inhibited. (d) Activator binds in the presence of the
signal; it dissociates only when the signal is removed. Note
that "positive" and "negative" regulation refer to
the type of regulatory protein involved: the bound protein
either facilitates or inhibits transcription. In either
case, addition of the molecular signal may increase or
decrease transcription, depending on its effect on the
regulatory protein.
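
The four patterns of Figure 28-4 reduce to a small truth table: the regulator is either a repressor or an activator, and the signal either promotes or prevents its binding to DNA. The sketch below encodes that logic only; the function name and arguments are hypothetical, and real regulation is graded rather than all-or-none.

```python
from itertools import product


def transcription_active(regulator: str, binds_with_signal: bool,
                         signal_present: bool) -> bool:
    """Return True if transcription proceeds in this idealized model.

    regulator: "repressor" (negative regulation) or "activator"
        (positive regulation).
    binds_with_signal: True if the regulator binds DNA only when the
        signal is present; False if the signal makes it dissociate.
    """
    if regulator not in ("repressor", "activator"):
        raise ValueError("regulator must be 'repressor' or 'activator'")
    bound = (signal_present == binds_with_signal)
    # A bound activator turns transcription on; a bound repressor turns it off.
    return bound if regulator == "activator" else not bound


# Print every combination, covering panels (a) through (d) of Figure 28-4.
for reg, with_signal, signal in product(("repressor", "activator"),
                                        (False, True), (False, True)):
    state = "on" if transcription_active(reg, with_signal, signal) else "off"
    print(f"{reg:9s} binds_with_signal={with_signal!s:5s} "
          f"signal={signal!s:5s} -> {state}")
```

The table makes explicit that the sign of the regulation (positive or negative) is a property of the protein, whereas the effect of the signal depends on whether it promotes or prevents DNA binding.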

Many Prokaryotic Genes Are Clustered and 
Regulated in Operons 

Bacteria have a simple general mechanism for
coordinating the regulation of genes encoding products that
participate in a set of related processes: these genes are
clustered on the chromosome and are transcribed
together. Many prokaryotic mRNAs are polycistronic
(multiple genes on a single transcript), and the single
promoter that initiates transcription of the cluster is the
site of regulation for expression of all the genes in the
cluster. The gene cluster and promoter, plus additional
sequences that function together in regulation, are
called an operon (Fig. 28-5). Operons that include two
to six genes transcribed as a unit are common; some
operons contain 20 or more genes.

Many of the principles of prokaryotic gene
expression were first defined by studies of lactose
metabolism in E. coli, which can use lactose as its sole
carbon source. In 1960, François Jacob and Jacques Monod
published a short paper in the Proceedings of the French
Academy of Sciences that described how two adjacent genes
involved in lactose metabolism were coordinately
regulated by a genetic element located at one end of the
gene cluster. The genes were those for β-galactosidase,
which cleaves lactose to galactose and glucose, and
galactoside permease, which transports lactose into the
cell (Fig. 28-6). The terms "operon" and "operator"
were first introduced in this paper. With the operon
model, gene regulation could, for the first time, be
considered in molecular terms.






The lac Operon Is Subject to Negative Regulation 

The lactose (lac) operon (Fig. 28-7a) includes the
genes for β-galactosidase (Z), galactoside permease
(Y), and thiogalactoside transacetylase (A). The last of
these enzymes appears to modify toxic galactosides to
facilitate their removal from the cell. Each of the three
genes is preceded by a ribosome binding site (not shown
in Fig. 28-7) that independently directs the translation
of that gene (Chapter 27). Regulation of the lac operon
by the lac repressor protein (Lac) follows the pattern
outlined in Figure 28-4a.

FIGURE 28-5 Representative prokaryotic operon. Genes A, B,
and C are transcribed on one polycistronic mRNA. Typical
regulatory sequences include binding sites for proteins
that either activate or repress transcription from the
promoter. [Diagram labels: regulatory sequences, including
the promoter and the repressor binding site (operator);
genes A, B, and C, transcribed as a unit.]

FIGURE 28-6 Lactose metabolism in E. coli. Uptake and
metabolism of lactose require the activities of galactoside
permease and β-galactosidase. Conversion of lactose to
allolactose by transglycosylation is a minor reaction also
catalyzed by β-galactosidase.

François Jacob; Jacques Monod, 1910-1976

FIGURE 28-7 The lac operon. (a) The lac operon in the
repressed state. The I gene encodes the Lac repressor. The
lac Z, Y, and A genes encode β-galactosidase, galactoside
permease, and thiogalactoside transacetylase, respectively.
P is the promoter for the lac genes, and P_I is the
promoter for the I gene. O₁ is the main operator for the lac
operon; O₂ and O₃ are secondary operator sites of lesser
affinity for the Lac repressor. (b) The Lac repressor binds
to the main operator and O₂ or O₃, apparently forming a
loop in the DNA that might wrap around the repressor as
shown. (c) Lac repressor bound to DNA (derived from PDB ID
1LBG). This shows the protein (gray) bound to short,
discontinuous segments of DNA (blue). (d) Conformational
change in the Lac repressor caused by binding of the
artificial inducer isopropylthiogalactoside, IPTG (derived
from PDB ID 1LBH and 1LBG). The structure of the tetrameric
repressor is shown without IPTG bound (transparent image)
and with IPTG bound (overlaid solid image; IPTG not shown).
The DNA bound when IPTG is absent (transparent structure)
is not shown. When IPTG is bound and DNA is not bound, the
repressor's DNA-binding domains are too disordered to be
defined in the crystal structure.

The study of lac operon mutants has revealed some
details of the workings of the operon's regulatory
system. In the absence of lactose, the lac operon genes are
repressed. Mutations in the operator or in another gene,
the I gene, result in constitutive synthesis of the gene
products. When the I gene is defective, repression can
be restored by introducing a functional I gene into the
cell on another DNA molecule, demonstrating that the
I gene encodes a diffusible molecule that causes gene
repression. This molecule proved to be a protein, now
called the Lac repressor, a tetramer of identical
monomers. The operator to which it binds most tightly
(O₁) abuts the transcription start site (Fig. 28-7a). The
I gene is transcribed from its own promoter (P_I)
independent of the lac operon genes. The lac operon has
two secondary binding sites for the Lac repressor. One
(O₂) is centered near position +410, within the gene
encoding β-galactosidase (Z); the other (O₃) is near
position -90, within the I gene. To repress the operon, the
Lac repressor appears to bind to both the main
operator and one of the two secondary sites, with the
intervening DNA looped out (Fig. 28-7b, c). Either binding
arrangement blocks transcription initiation.
Despite this elaborate binding complex, repression
is not absolute. Binding of the Lac repressor reduces
the rate of transcription initiation by a factor of 10³. If
the O₂ and O₃ sites are eliminated by deletion or
mutation, the binding of repressor to O₁ alone reduces
transcription by a factor of about 10². Even in the repressed
state, each cell has a few molecules of β-galactosidase
and galactoside permease, presumably synthesized on
the rare occasions when the repressor transiently
dissociates from the operators. This basal level of
transcription is essential to operon regulation.

When cells are provided with lactose, the lac operon
is induced. An inducer (signal) molecule binds to a
specific site on the Lac repressor, causing a conformational
change (Fig. 28-7d) that results in dissociation of the
repressor from the operator. The inducer in the lac
operon system is not lactose itself but allolactose, an
isomer of lactose (Fig. 28-6). After entry into the
E. coli cell (via the few existing molecules of permease),
lactose is converted to allolactose by one of the few
existing β-galactosidase molecules. Release of the
operator by Lac repressor, triggered as the repressor binds to
allolactose, allows expression of the lac operon genes
and leads to a 10³-fold increase in the concentration of
β-galactosidase.
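
The repression factors quoted above translate directly into relative rates of transcription initiation. The sketch below is an idealized on/off model that uses only the two numbers given in the text (about 10³-fold repression with all operator sites intact, about 10²-fold with the main operator alone); the function name and the normalization of the fully induced rate to 1.0 are assumptions made for illustration.

```python
FULL_REPRESSION_FOLD = 1e3   # all operator sites intact (from the text)
MAIN_OPERATOR_FOLD = 1e2     # secondary sites deleted (from the text)


def relative_initiation_rate(inducer_present: bool,
                             secondary_sites_intact: bool = True) -> float:
    """Initiation rate relative to the fully induced operon (1.0).

    Idealized: the repressor is treated as either fully released by the
    inducer or fully engaged, with no intermediate occupancy.
    """
    if inducer_present:
        return 1.0
    fold = FULL_REPRESSION_FOLD if secondary_sites_intact else MAIN_OPERATOR_FOLD
    return 1.0 / fold


repressed = relative_initiation_rate(inducer_present=False)
leaky = relative_initiation_rate(inducer_present=False,
                                 secondary_sites_intact=False)
induced = relative_initiation_rate(inducer_present=True)
print(repressed, leaky, induced)
```

The nonzero repressed rate corresponds to the basal transcription that supplies the few permease and β-galactosidase molecules needed to sense lactose in the first place.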

Several β-galactosides structurally related to
allolactose are inducers of the lac operon but are not
substrates for β-galactosidase; others are substrates but not
inducers. One particularly effective and nonmetabolizable
inducer of the lac operon that is often used
experimentally is isopropylthiogalactoside (IPTG).
An inducer that cannot be metabolized allows researchers
to explore the physiological function of lactose as a
carbon source for growth, separate from its function in the
regulation of gene expression.

In addition to the multitude of operons now known 
in bacteria, a few polycistronic operons have been found 
in the cells of lower eukaryotes. In the cells of higher 
eukaryotes, however, almost all protein-encoding genes 
are transcribed separately. 

The mechanisms by which operons are regulated 
can vary significantly from the simple model presented 
in Figure 28-7. Even the lac operon is more complex 
than indicated here, with an activator also contributing 
to the overall scheme, as we shall see in Section 28.2. 
Before any further discussion of the layers of regulation 
of gene expression, however, we examine the critical 
molecular interactions between DNA-binding proteins 
(such as repressors and activators) and the DNA
sequences to which they bind.

Regulatory Proteins Have Discrete 
DNA-Binding Domains 

Regulatory proteins generally bind to specific DNA
sequences. Their affinity for these target sequences is
roughly 10⁴ to 10⁶ times higher than their affinity for
any other DNA sequences. Most regulatory proteins
have discrete DNA-binding domains containing
substructures that interact closely and specifically with the
DNA. These binding domains usually include one or
more of a relatively small group of recognizable and
characteristic structural motifs.

To bind specifically to DNA sequences, regulatory
proteins must recognize surface features on the DNA.
Most of the chemical groups that differ among the four
bases and thus permit discrimination between base pairs
are hydrogen-bond donor and acceptor groups exposed
in the major groove of DNA (Fig. 28-8), and most of the
protein-DNA contacts that impart specificity are
hydrogen bonds. A notable exception is the nonpolar surface
near C-5 of pyrimidines, where thymine is readily
distinguished from cytosine by its protruding methyl group.
Protein-DNA contacts are also possible in the minor
groove of the DNA, but the hydrogen-bonding patterns
here generally do not allow ready discrimination
between base pairs.

FIGURE 28-8 Groups in DNA available for protein binding.
Shown here are functional groups on all four base pairs
(adenine-thymine and guanine-cytosine) that are displayed
in the major and minor grooves of DNA. Groups that can be
used for base-pair recognition by proteins are shown in
red.

FIGURE 28-9 Two examples of specific amino acid-base pair
interactions that have been observed in DNA-protein
binding.

Within regulatory proteins, the amino acid side
chains most often hydrogen-bonding to bases in the
DNA are those of Asn, Gln, Glu, Lys, and Arg residues.
Is there a simple recognition code in which a particular
amino acid always pairs with a particular base? The two
hydrogen bonds that can form between Gln or Asn and
the N⁶ and N-7 positions of adenine cannot form with
any other base. And an Arg residue can form two
hydrogen bonds with N-7 and O⁶ of guanine (Fig. 28-9).
Examination of the structures of many DNA-binding
proteins, however, has shown that a protein can
recognize each base pair in more than one way, leading to the
conclusion that there is no simple amino acid-base code.
For some proteins, the Gln-adenine interaction can
specify A=T base pairs, but in others a van der Waals
pocket for the methyl group of thymine can recognize
A=T base pairs. Researchers cannot yet examine the
structure of a DNA-binding protein and infer the DNA
sequence to which it binds.


To interact with bases in the major groove of DNA,
a protein requires a relatively small structure that can
stably protrude from the protein surface. The
DNA-binding domains of regulatory proteins tend to be small
(60 to 90 amino acid residues), and the structural
motifs within these domains that are actually in contact
with the DNA are smaller still. Many small proteins are
unstable because of their limited capacity to form
layers of structure to bury hydrophobic groups (p. 118).
The DNA-binding motifs provide either a very compact
stable structure or a way of allowing a segment of
protein to protrude from the protein surface.

The DNA-binding sites for regulatory proteins are
often inverted repeats of a short DNA sequence (a
palindrome) at which multiple (usually two) subunits of a
regulatory protein bind cooperatively. The Lac
repressor is unusual in that it functions as a tetramer, with
two dimers tethered together at the end distant from the
DNA-binding sites (Fig. 28-7b). An E. coli cell normally
contains about 20 tetramers of the Lac repressor. Each
of the tethered dimers separately binds to a palindromic
operator sequence, in contact with 17 bp of a 22 bp
region in the lac operon (Fig. 28-10). And each of the
tethered dimers can independently bind to an operator
sequence, with one generally binding to O₁ and the other
to O₂ or O₃ (as in Fig. 28-7b). The symmetry of the O₁
operator sequence corresponds to the twofold axis of
symmetry of two paired Lac repressor subunits. The
tetrameric Lac repressor binds to its operator sequences
in vivo with an estimated dissociation constant of about
10⁻¹⁰ M. The repressor discriminates between the
operators and other sequences by a factor of about 10⁶, so
binding to these few base pairs among the 4.6 million
or so of the E. coli chromosome is highly specific.
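
Two ideas in the preceding paragraph lend themselves to a short calculation: what makes a site an inverted repeat, and what a dissociation constant near 10⁻¹⁰ M implies when a cell holds about 20 repressor tetramers. The sketch below makes loud simplifying assumptions that are not in the text: a cell volume of about 1 femtoliter, a single-site binding isotherm, and free repressor approximately equal to total repressor (in a real cell much of the repressor is bound nonspecifically to other DNA). The test sequences are illustrative, not the lac operator.

```python
AVOGADRO = 6.022e23        # molecules per mole
CELL_VOLUME_L = 1e-15      # ASSUMED E. coli volume, about 1 femtoliter
KD_OPERATOR_M = 1e-10      # dissociation constant quoted in the text
TETRAMERS_PER_CELL = 20    # copy number quoted in the text

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


def molar_concentration(molecules: float,
                        volume_l: float = CELL_VOLUME_L) -> float:
    """Convert a per-cell copy number to a molar concentration."""
    return molecules / (AVOGADRO * volume_l)


def fraction_bound(ligand_m: float, kd_m: float) -> float:
    """Single-site binding isotherm: [L] / ([L] + Kd)."""
    return ligand_m / (ligand_m + kd_m)


repressor_m = molar_concentration(TETRAMERS_PER_CELL)
occupancy = fraction_bound(repressor_m, KD_OPERATOR_M)
print(is_inverted_repeat("GAATTC"), is_inverted_repeat("GAATTA"))
print(f"{repressor_m:.2e} M repressor, operator occupancy {occupancy:.3f}")
```

Under these assumptions the repressor concentration is tens of nanomolar, far above the dissociation constant, so the operator is occupied almost all of the time; natural operators are only approximately palindromic, so a strict equality test like the one here would need to tolerate mismatches in practice.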

Several DNA-binding motifs have been described, 
but here we focus on two that play prominent roles in 
the binding of DNA by regulatory proteins: the helix- 
turn-helix and the zinc finger. We also consider a type 
of DNA-binding domain—the homeodomain—found in 
some eukaryotic proteins. 

Helix-Turn-Helix This DNA-binding motif is crucial to the
interaction of many prokaryotic regulatory proteins with
DNA, and similar motifs occur in some eukaryotic
regulatory proteins. The helix-turn-helix motif comprises
about 20 amino acids in two short α-helical segments,
each seven to nine amino acid residues long, separated
by a β turn (Fig. 28-11). This structure generally is not
stable by itself; it is simply the reactive portion of a
somewhat larger DNA-binding domain. One of the two
α-helical segments is called the recognition helix,
because it usually contains many of the amino acids that
interact with the DNA in a sequence-specific way. This
α helix is stacked on other segments of the protein
structure so that it protrudes from the protein surface.
When bound to DNA, the recognition helix is positioned
in or nearly in the major groove. The Lac repressor has
this DNA-binding motif (Fig. 28-11).

FIGURE 28-10 Relationship between the lac operator sequence
O₁ and the lac promoter. The bases shaded beige exhibit
twofold (palindromic) symmetry about the axis indicated by
the dashed vertical line.

FIGURE 28-11 Helix-turn-helix. (a) DNA-binding domain of
the Lac repressor (PDB ID 1LCC). The helix-turn-helix motif
is shown in red and orange; the DNA recognition helix is
red. (b) Entire Lac repressor (derived from PDB ID 1LBG).
The DNA-binding domains are gray, and the α helices
involved in tetramerization are red. The remainder
of the protein (shades of green) has the binding sites for
allolactose. The allolactose-binding domains are linked to
the DNA-binding domains through linker helices (yellow).
(c) Surface rendering of the DNA-binding domain of the Lac
repressor (gray) bound to DNA (blue). (d) The same
DNA-binding domain as in (c), but separated from the
DNA, with the binding interaction surfaces shown. Some
groups on the protein and DNA that interact through
hydrogen-bonding are shown in red; some groups that
interact through hydrophobic interactions are in orange.
This model shows only a few of the groups involved in
sequence recognition. The complementary nature of the two
surfaces is evident.
Zinc Finger In a zinc finger, about 30 amino acid
residues form an elongated loop held together at the
base by a single Zn²⁺ ion, which is coordinated to four
of the residues (four Cys, or two Cys and two His). The
zinc does not itself interact with DNA; rather, the
coordination of zinc with the amino acid residues stabilizes
this small structural motif. Several hydrophobic side
chains in the core of the structure also lend stability.
Figure 28-12 shows the interaction between DNA and
three zinc fingers of a single polypeptide from the mouse
regulatory protein Zif268.

Many eukaryotic DNA-binding proteins contain zinc 
fingers. The interaction of a single zinc finger with DNA 
is typically weak, and many DNA-binding proteins, like 
Zif268, have multiple zinc fingers that substantially
enhance binding by interacting simultaneously with the
DNA. One DNA-binding protein of the frog Xenopus has 
37 zinc fingers. There are few known examples of the 
zinc finger motif in prokaryotic proteins. 

The precise manner in which proteins with zinc
fingers bind to DNA differs from one protein to the next.
Some zinc fingers contain the amino acid residues that
are important in sequence discrimination, whereas
others appear to bind DNA nonspecifically (the amino acids
required for specificity are located elsewhere in the
protein). Zinc fingers can also function as RNA-binding
motifs, for example in certain proteins that bind
eukaryotic mRNAs and act as translational repressors. We
discuss this role later (Section 28.3).

Homeodomain Another type of DNA-binding domain has 
been identified in a number of proteins that function as
transcriptional regulators, especially during eukaryotic 


FIGURE 28-12 Zinc fingers. Three zinc fingers (gray) of the regula¬ 
tory protein Zif268, complexed with DNA (blue and white) (PDB ID 
1A1L). Each Zn²⁺ (maroon) coordinates with two His and two Cys
residues (not shown). 


FIGURE 28-13 Homeodomain. Shown here is a homeodomain 
bound to DNA; one of the α helices (red), stacked on two others, can
be seen protruding into the major groove (PDB ID 1B8I). This is only
a small part of the much larger protein Ultrabithorax (Ubx), active in 
the regulation of development in fruit flies. 


development. This domain of 60 amino acids—called the 
homeodomain, because it was discovered in homeotic 
genes (genes that regulate the development of body pat¬ 
terns)—is highly conserved and has now been identified 
in proteins from a wide variety of organisms, including 
humans (Fig. 28-13). The DNA-binding segment of the 
domain is related to the helix-turn-helix motif. The DNA 
sequence that encodes this domain is known as the 
homeobox. 

Regulatory Proteins Also Have Protein-Protein 
Interaction Domains 

Regulatory proteins contain domains not only for DNA 
binding but also for protein-protein interactions—with 
RNA polymerase, other regulatory proteins, or other sub¬ 
units of the same regulatory protein. Examples include 
many eukaryotic transcription factors that function as 
gene activators, which often bind as dimers to the DNA, 
using DNA-binding domains that contain zinc fingers. 
Some structural domains are devoted to the interactions 
required for dimer formation, which is generally a pre¬ 
requisite for DNA binding. Like DNA-binding motifs, the 
structural motifs that mediate protein-protein interac¬ 
tions tend to fall within one of a few common categories. 
Two important examples are the leucine zipper and 
the basic helix-loop-helix. Structural motifs such as
these are the basis for classifying some regulatory proteins
into structural families.

Leucine Zipper This motif is an amphipathic α helix with
a series of hydrophobic amino acid residues concentrated
on one side (Fig. 28-14), with the hydrophobic
surface forming the area of contact between the two
polypeptides of a dimer. A striking feature of these
α helices is the occurrence of Leu residues at every
seventh position, forming a straight line along the
hydrophobic surface. Although researchers initially
thought the Leu residues interdigitated (hence the
name “zipper”), we now know that they line up side by
side as the interacting α helices coil around each other
(forming a coiled coil; Fig. 28-14b). Regulatory proteins
with leucine zippers often have a separate DNA-binding 
domain with a high concentration of basic (Lys or Arg) 
residues that can interact with the negatively charged 
phosphates of the DNA backbone. Leucine zippers have 
been found in many eukaryotic and a few prokaryotic 
proteins. 

Basic Helix-Loop-Helix Another common structural motif 
occurs in some eukaryotic regulatory proteins implicated 


in the control of gene expression during the develop¬ 
ment of multicellular organisms. These proteins share a 
conserved region of about 50 amino acid residues im¬ 
portant in both DNA binding and protein dimerization. 
This region can form two short amphipathic α helices
linked by a loop of variable length, the helix-loop-helix 
(distinct from the helix-turn-helix motif associated 
with DNA binding). The helix-loop-helix motifs of two 
polypeptides interact to form dimers (Fig. 28-15). In 
these proteins, DNA binding is mediated by an adjacent 
short amino acid sequence rich in basic residues, simi¬ 
lar to the separate DNA-binding region in proteins con¬ 
taining leucine zippers. 

Subunit Mixing in Eukaryotic Regulatory Proteins Several 
families of eukaryotic transcription factors have been 
defined based on close structural similarities. Within 
each family, dimers can sometimes form between two 
identical proteins (a homodimer) or between two dif¬ 
ferent members of the family (a heterodimer). A hypo¬ 
thetical family of four different leucine-zipper proteins 
could thus form up to ten different dimeric species. In 
many cases, the different combinations appear to have 
distinct regulatory and functional properties. 
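
The count of ten dimeric species follows from simple combinatorics: four homodimers plus six heterodimers. The sketch below enumerates them for a hypothetical family with placeholder member names A to D, assuming that every pairing within the family is permitted (real families may restrict which members can pair).

```python
from itertools import combinations_with_replacement

def dimer_species(family):
    """Return all unordered pairs (homodimers and heterodimers) from a family."""
    return list(combinations_with_replacement(family, 2))

# Hypothetical four-member leucine-zipper family; the names are placeholders.
family = ["A", "B", "C", "D"]
dimers = dimer_species(family)
homodimers = [d for d in dimers if d[0] == d[1]]
heterodimers = [d for d in dimers if d[0] != d[1]]
print(len(homodimers), len(heterodimers), len(dimers))  # 4 6 10
```

In general a family of n members can form n(n + 1)/2 dimers under this assumption, so the number of potential regulators grows much faster than the number of genes encoding them.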
FIGURE 28-14 Leucine zippers. (a) Comparison of
amino acid sequences of several leucine zipper
proteins. Note the Leu (L) residues at every seventh
position in the zipper region, and the number of Lys
(K) and Arg (R) residues in the DNA-binding region.
(b) Leucine zipper from the yeast activator protein
GCN4 (PDB ID 1YSA). Only the "zippered" α helices
(gray and light blue), derived from different subunits of
the dimeric protein, are shown. The two helices wrap
around each other in a gently coiled coil. The interacting
Leu residues are shown in red.

FIGURE 28-15 Helix-loop-helix. The human transcription factor Max,
bound to its DNA target site (PDB ID 1HLO). The protein is dimeric;
one subunit is colored. The DNA-binding segment (pink) merges with 
the first helix of the helix-loop-helix (red). The second helix merges 
with the carboxyl-terminal end of the subunit (purple). Interaction of 
the carboxyl-terminal helices of the two subunits describes a coiled 
coil very similar to that of a leucine zipper (see Fig. 28-14b), but with 
only one pair of interacting Leu residues (red side chains near the top) 
in this particular example. The overall structure is sometimes called a 
helix-loop-helix/leucine zipper motif. 


In addition to structural domains devoted to DNA 
binding and dimerization (or oligomerization), many 
regulatory proteins must interact with RNA polymerase, 
with unrelated regulatory proteins, or with both. At least 
three different types of additional domains for protein- 
protein interaction have been characterized (primarily 
in eukaryotes): glutamine-rich, proline-rich, and acidic 
domains, the names reflecting the amino acid residues 
that are especially abundant. 

Protein-DNA binding interactions are the basis of 
the intricate regulatory circuits fundamental to gene 
function. We now turn to a closer examination of these 
gene regulatory schemes, first in prokaryotic, then in 
eukaryotic systems. 

SUMMARY 28.1 Principles of Gene Regulation 


■ The expression of genes is regulated by
processes that affect the rates at which gene
products are synthesized and degraded. Much
of this regulation occurs at the level of
transcription initiation, mediated by regulatory
proteins that either repress transcription
(negative regulation) or activate transcription
(positive regulation) at specific promoters.

■ In bacteria, genes that encode products with
interdependent functions are often clustered in
an operon, a single transcriptional unit.
Transcription of the genes is generally blocked
by binding of a specific repressor protein at a
DNA site called an operator. Dissociation of the
repressor from the operator is mediated by a
specific small molecule, an inducer. These
principles were first elucidated in studies of the
lactose (lac) operon. The Lac repressor
dissociates from the lac operator when the
repressor binds to its inducer, allolactose.

■ Regulatory proteins are DNA-binding proteins 
that recognize specific DNA sequences; most 
have distinct DNA-binding domains. Within 
these domains, common structural motifs that 
bind DNA are the helix-turn-helix, zinc finger, 
and homeodomain. 

■ Regulatory proteins also contain domains for 
protein-protein interactions, including the 
leucine zipper and helix-loop-helix, which are 
involved in dimerization, and other motifs 
involved in activation of transcription. 

28.2 Regulation of Gene Expression 
in Prokaryotes 

As in many other areas of biochemical investigation, the 
study of the regulation of gene expression advanced ear¬ 
lier and faster in bacteria than in other experimental or¬ 
ganisms. The examples of bacterial gene regulation pre¬ 
sented here are chosen from among scores of 
well-studied systems, partly for their historical signifi¬ 
cance, but primarily because they provide a good 
overview of the range of regulatory mechanisms em¬ 
ployed in prokaryotes. Many of the principles of prokary¬ 
otic gene regulation are also relevant to understanding 
gene expression in eukaryotic cells. 

We begin by examining the lactose and tryptophan 
operons; each system has regulatory proteins, but the 
overall mechanisms of regulation are very different. This 
is followed by a short discussion of the SOS response in 
E. coli, illustrating how genes scattered throughout the 
genome can be coordinately regulated. We then describe 
two prokaryotic systems of quite different types, illus¬ 
trating the diversity of gene regulatory mechanisms: 
regulation of ribosomal protein synthesis at the level of 
translation, with many of the regulatory proteins bind¬ 
ing to RNA (rather than DNA), and regulation of a 
process called phase variation in Salmonella, which re¬ 
sults from genetic recombination. First, we return to the 
lac operon to examine its features in greater detail.

The lac Operon Undergoes Positive Regulation

The operator-repressor-inducer interactions described 
earlier for the lac operon (Fig. 28-7) provide an intu¬ 
itively satisfying model for an on/off switch in the reg¬ 
ulation of gene expression. In truth, operon regulation 
is rarely so simple. A bacterium’s environment is too 
complex for its genes to be controlled by one signal. 
Other factors besides lactose affect the expression of 
the lac genes, such as the availability of glucose. Glu¬ 
cose, metabolized directly by glycolysis, is E. coli's pre¬ 
ferred energy source. Other sugars can serve as the main 
or sole nutrient, but extra steps are required to prepare 
them for entry into glycolysis, necessitating the syn¬ 
thesis of additional enzymes. Clearly, expressing the 
genes for proteins that metabolize sugars such as lac¬ 
tose or arabinose is wasteful when glucose is abundant. 

What happens to the expression of the lac operon 
when both glucose and lactose are present? A regula¬ 
tory mechanism known as catabolite repression re¬ 
stricts expression of the genes required for catabolism 
of lactose, arabinose, and other sugars in the presence 
of glucose, even when these secondary sugars are also 
present. The effect of glucose is mediated by cAMP, as 
a coactivator, and an activator protein known as cAMP 
receptor protein, or CRP (the protein is sometimes 
called CAP, for catabolite gene activator protein). CRP 
is a homodimer (subunit M_r 22,000) with binding sites
for DNA and cAMP. Binding is mediated by a helix-turn- 
helix motif within the protein’s DNA-binding domain 
(Fig. 28-16). When glucose is absent, CRP-cAMP binds 
to a site near the lac promoter (Fig. 28-17a) and stim¬ 
ulates RNA transcription 50-fold. CRP-cAMP is there¬ 
fore a positive regulatory element responsive to glucose 
levels, whereas the Lac repressor is a negative regulatory
element responsive to lactose. The two act in concert.
CRP-cAMP has little effect on the lac operon when
the Lac repressor is blocking transcription, and dissociation
of the repressor from the lac operator has little
effect on transcription of the lac operon unless CRP-cAMP
is present to facilitate transcription; when CRP is
not bound, the wild-type lac promoter is a relatively
weak promoter (Fig. 28-17b). The open complex of
RNA polymerase and the promoter (see Fig. 26-6) does
not form readily unless CRP-cAMP is present. CRP interacts
directly with RNA polymerase (at the region shown
in Fig. 28-16) through the polymerase’s α subunit.

FIGURE 28-16 CRP homodimer. (PDB ID 1RUN) Bound molecules
of cAMP are shown in red. Note the bending of the DNA around the
protein. The region that interacts with RNA polymerase is shaded
yellow.




[Figure 28-17 labels: CRP site; promoter; region bound by RNA polymerase; mRNA (5' to 3')]

DNA 5'-ATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACAC

FIGURE 28-17 Activation of transcription of the lac operon by CRP.
(a) The binding site for CRP-cAMP is near the promoter. As in the case
of the lac operator, the CRP site has twofold symmetry (bases shaded
beige) about the axis indicated by the dashed line. (b) Sequence of
the lac promoter compared with the promoter consensus sequence.
The differences mean that RNA polymerase binds relatively weakly to
the lac promoter until the polymerase is activated by CRP-cAMP.

FIGURE 28-18 Combined effects of glucose and lactose on expression of the lac operon. (a) High
levels of transcription take place only when glucose concentrations are low (so cAMP levels are high 
and CRP-cAMP is bound) and lactose concentrations are high (so the Lac repressor is not bound). 

(b) Without bound activator (CRP-cAMP), the lac promoter is poorly transcribed even when lactose 
concentrations are high and the Lac repressor is not bound. 


The effect of glucose on CRP is mediated by the 
cAMP interaction (Fig. 28-18). CRP binds to DNA most 
avidly when cAMP concentrations are high. In the pres¬ 
ence of glucose, the synthesis of cAMP is inhibited and 
efflux of cAMP from the cell is stimulated. As [cAMP] 
declines, CRP binding to DNA declines, thereby decreasing
the expression of the lac operon. Strong induction
of the lac operon therefore requires both lactose
(to inactivate the Lac repressor) and a lowered
concentration of glucose (to trigger an increase in
[cAMP] and increased binding of cAMP to CRP).
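
The two-signal logic described above can be summarized as a small truth table. The following sketch is illustrative only: the function name and the arbitrary units are assumptions, the repressed state is treated as fully off for simplicity, and the only number taken from the text is the roughly 50-fold stimulation by CRP-cAMP.

```python
def lac_transcription(glucose_present: bool, lactose_present: bool) -> float:
    """Relative lac operon transcription (arbitrary units, illustrative only)."""
    BASAL = 0.0      # repressor bound: treated as off (a simplification)
    WEAK = 1.0       # repressor released, no CRP-cAMP: weak promoter
    CRP_FOLD = 50.0  # stimulation by CRP-cAMP, per the 50-fold figure in the text

    repressor_bound = not lactose_present  # allolactose releases the Lac repressor
    crp_camp_bound = not glucose_present   # low glucose -> high cAMP -> CRP-cAMP binds

    if repressor_bound:
        return BASAL
    return WEAK * CRP_FOLD if crp_camp_bound else WEAK

# Print the full truth table: (glucose, lactose) -> relative transcription.
for glucose in (True, False):
    for lactose in (True, False):
        print(glucose, lactose, lac_transcription(glucose, lactose))
```

Only the combination of lactose present and glucose absent gives strong expression; lactose alone, with glucose still available, leaves the promoter in its weak state.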

CRP and cAMP are involved in the coordinated reg¬ 
ulation of many operons, primarily those that encode 
enzymes for the metabolism of secondary sugars such 
as lactose and arabinose. A network of operons with a 
common regulator is called a regulon. This arrange¬ 
ment, which allows for coordinated shifts in cellular 
functions that can require the action of hundreds of 
genes, is a major theme in the regulated expression of 
dispersed networks of genes in eukaryotes. Other bac¬ 
terial regulons include the heat-shock gene system that 
responds to changes in temperature (p. 1083) and the 
genes induced in E. coli as part of the SOS response to 
DNA damage, described later. 

Many Genes for Amino Acid Biosynthetic Enzymes Are 
Regulated by Transcription Attenuation 

The 20 common amino acids are required in large 
amounts for protein synthesis, and E. coli can synthe¬ 
size all of them. The genes for the enzymes needed to 
synthesize a given amino acid are generally clustered in 
an operon and are expressed whenever existing supplies 
of that amino acid are inadequate for cellular requirements.
When the amino acid is abundant, the biosynthetic
enzymes are not needed and the operon is
repressed.

The E. coli tryptophan (trp) operon (Fig. 28-19)
includes five genes for the enzymes required to convert 
chorismate to tryptophan. Note that two of the enzymes 
catalyze more than one step in the pathway. The mRNA 
from the trp operon has a half-life of only about 3 min, 
allowing the cell to respond rapidly to changing needs 
for this amino acid. The Trp repressor is a homodimer, 
each subunit containing 107 amino acid residues (Fig. 
28-20). When tryptophan is abundant it binds to the 
Trp repressor, causing a conformational change that 
permits the repressor to bind to the trp operator and 
inhibit expression of the trp operon. The trp operator 
site overlaps the promoter, so binding of the repressor 
blocks binding of RNA polymerase. 

Once again, this simple on/off circuit mediated by a 
repressor is not the entire regulatory story. Different 
cellular concentrations of tryptophan can vary the rate 
of synthesis of the biosynthetic enzymes over a 700-fold 
range. Once repression is lifted and transcription be¬ 
gins, the rate of transcription is fine-tuned by a second 
regulatory process, called transcription attenuation, 
in which transcription is initiated normally but is 
abruptly halted before the operon genes are transcribed. 
The frequency with which transcription is attenuated is 
regulated by the availability of tryptophan and relies on 
the very close coupling of transcription and translation 
in bacteria. 

The trp operon attenuation mechanism uses signals 
encoded in four sequences within a 162 nucleotide 
leader region at the 5' end of the mRNA, preceding the 
initiation codon of the first gene (Fig. 28-21a). Within
the leader lies a region known as the attenuator, made 
up of sequences 3 and 4. These sequences base-pair to

FIGURE 28-19 The trp operon. This operon is regulated by two
[Figure labels: trp mRNA (low tryptophan levels)]

FIGURE 28-20 Trp repressor. The repressor is a dimer, with both sub¬ 
units (gray and light blue) binding the DNA at helix-turn-helix motifs 
(PDB ID 1TRO). Bound molecules of tryptophan are in red. 


form a G≡C-rich stem-and-loop structure closely followed
by a series of U residues. The attenuator structure
acts as a transcription terminator (Fig. 28-21b).
Sequence 2 is an alternative complement for sequence
3 (Fig. 28-21c). If sequences 2 and 3 base-pair, the attenuator
structure cannot form and transcription continues
into the trp biosynthetic genes; the loop formed
by the pairing of sequences 2 and 3 does not obstruct
transcription.

Regulatory sequence 1 is crucial for a tryptophan- 
sensitive mechanism that determines whether sequence 
3 pairs with sequence 2 (allowing transcription to con¬ 
tinue) or with sequence 4 (attenuating transcription). 
Formation of the attenuator stem-and-loop structure 
depends on events that occur during translation of reg¬ 
ulatory sequence 1, which encodes a leader peptide (so 
called because it is encoded by the leader region of the 
mRNA) of 14 amino acids, two of which are Trp residues. 
The leader peptide has no other known cellular function;
its synthesis is simply an operon regulatory device.

When tryptophan levels are high, the ribosome quickly translates
sequence 1 (open reading frame encoding leader peptide) and blocks 
sequence 2 before sequence 3 is transcribed. Continued transcription 
leads to attenuation at the terminator-like attenuator structure 
formed by sequences 3 and 4. 



When tryptophan levels are low, the ribosome pauses at the 
Trp codons in sequence 1. Formation of the paired structure 
between sequences 2 and 3 prevents attenuation, because 
sequence 3 is no longer available to form the attenuator 
structure with sequence 4. The 2:3 structure, unlike the 
3:4 attenuator, does not prevent transcription.

FIGURE 28-21 Transcriptional attenuation in the trp operon. Transcription
is initiated at the beginning of the 162 nucleotide mRNA
leader encoded by a DNA region called trpL (see Fig. 28-19). A regulatory
mechanism determines whether transcription is attenuated at
the end of the leader or continues into the structural genes. (a) The
trp mRNA leader (trpL). The attenuation mechanism in the trp operon
involves sequences 1 to 4 (highlighted). (b) Sequence 1 encodes a
small peptide, the leader peptide, containing two Trp residues (W); it
is translated immediately after transcription begins. Sequences 2 and
3 are complementary, as are sequences 3 and 4. The attenuator structure
forms by the pairing of sequences 3 and 4 (top). Its structure and
function are similar to those of a transcription terminator (see Fig.
26-7). Pairing of sequences 2 and 3 (bottom) prevents the attenuator
structure from forming. Note that the leader peptide has no other cellular
function. Translation of its open reading frame has a purely regulatory
role that determines which complementary sequences (2 and
3 or 3 and 4) are paired. (c) Base-pairing schemes for the complementary
regions of the trp mRNA leader.

This peptide is translated immediately after it is transcribed,
by a ribosome that follows closely behind RNA
polymerase as transcription proceeds.

When tryptophan concentrations are high, concentrations
of charged tryptophan tRNA (Trp-tRNA^Trp) are
also high. This allows translation to proceed rapidly past
the two Trp codons of sequence 1 and into sequence 2,
before sequence 3 is synthesized by RNA polymerase.
In this situation, sequence 2 is covered by the ribosome
and unavailable for pairing to sequence 3 when sequence
3 is synthesized; the attenuator structure (sequences
3 and 4) forms and transcription halts (Fig.
28-21b, top). When tryptophan concentrations are low,
however, the ribosome stalls at the two Trp codons in
sequence 1, because charged tRNA^Trp is less available.
Sequence 2 remains free while sequence 3 is synthesized,
allowing these two sequences to base-pair and
permitting transcription to proceed (Fig. 28-21b, bottom).
In this way, the proportion of transcripts that
are attenuated declines as tryptophan concentration
declines.
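
The decision made on each leader transcript can be written as a short rule, and the population-level effect as a fraction. This is a minimal sketch, not a kinetic model: the function names are invented for illustration, and `p_stall` is a hypothetical probability standing in for the scarcity of charged tRNA^Trp.

```python
import random

def leader_outcome(ribosome_stalls: bool) -> str:
    """Fate of one trp leader transcript, following the pairing rules in the text."""
    if ribosome_stalls:
        # Ribosome waits at the Trp codons in sequence 1; sequence 2 is free
        # to pair with sequence 3, so the 3:4 attenuator cannot form.
        return "read-through"
    # Ribosome covers sequence 2; sequences 3 and 4 pair and terminate transcription.
    return "attenuated"

def fraction_attenuated(p_stall: float, n: int = 10_000, seed: int = 0) -> float:
    """Fraction of n transcripts attenuated when each ribosome stalls with probability p_stall."""
    rng = random.Random(seed)
    outcomes = (leader_outcome(rng.random() < p_stall) for _ in range(n))
    return sum(o == "attenuated" for o in outcomes) / n

# p_stall is a hypothetical stand-in for scarcity of charged tRNA^Trp.
print(fraction_attenuated(0.1))  # tryptophan abundant: most transcripts attenuated
print(fraction_attenuated(0.9))  # tryptophan scarce: most transcripts read through
```

Because each transcript is decided independently, the output is graded rather than all-or-none, which is how attenuation can fine-tune expression once repression has been lifted.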

Many other amino acid biosynthetic operons use a
similar attenuation strategy to fine-tune biosynthetic enzymes
to meet the prevailing cellular requirements. The
15 amino acid leader peptide produced by the phe
operon contains seven Phe residues. The leu operon
leader peptide has four contiguous Leu residues. The
leader peptide for the his operon contains seven contiguous
His residues. In fact, in the his operon and a
number of others, attenuation is sufficiently sensitive to
be the only regulatory mechanism.

Induction of the SOS Response Requires Destruction 
of Repressor Proteins 

Extensive DNA damage in the bacterial chromosome 
triggers the induction of many distantly located genes. 
This response, called the SOS response (p. 976), pro¬ 
vides another good example of coordinated gene regu¬ 
lation. Many of the induced genes are involved in DNA 
repair (see Table 25-6). The key regulatory proteins are 
the RecA protein and the LexA repressor. 

FIGURE 28-22 SOS response in E. coli. See Table
25-6 for the functions of many of these proteins.
The LexA protein is the repressor in this system,
which has an operator site (red) near each gene.
Because the recA gene is not entirely repressed by
the LexA repressor, the normal cell contains about
1,000 RecA monomers. (1) When DNA is extensively
damaged (e.g., by UV light), DNA replication
is halted and the number of single-strand gaps in
the DNA increases. (2) RecA protein binds to this
damaged, single-stranded DNA, activating the
protein's coprotease activity. (3) While bound to
DNA, the RecA protein facilitates cleavage and
inactivation of the LexA repressor. When the
repressor is inactivated, the SOS genes, including
recA, are induced; RecA levels increase 50- to
100-fold.

The LexA repressor (M_r 22,700) inhibits transcription
of all the SOS genes (Fig. 28-22), and induction
of the SOS response requires removal of LexA. This is
not a simple dissociation from DNA in response to binding
of a small molecule, as in the regulation of the lac
operon described above. Instead, the LexA repressor is
inactivated when it catalyzes its own cleavage at a specific
Ala-Gly peptide bond, producing two roughly
equal protein fragments. At physiological pH, this au¬ 
tocleavage reaction requires the RecA protein. RecA is 
not a protease in the classical sense, but its interaction 
with LexA facilitates the repressor’s self-cleavage reac¬ 
tion. This function of RecA is sometimes called a co¬ 
protease activity. 

The RecA protein provides the functional link be¬ 
tween the biological signal (DNA damage) and induc¬ 
tion of the SOS genes. Heavy DNA damage leads to nu¬ 
merous single-strand gaps in the DNA, and only RecA 
that is bound to single-stranded DNA can facilitate 
cleavage of the LexA repressor (Fig. 28-22, bottom). 
Binding of RecA at the gaps eventually activates its co¬ 
protease activity, leading to cleavage of the LexA re¬ 
pressor and SOS induction. 

During induction of the SOS response in a severely 
damaged cell, RecA also cleaves and thus inactivates the 
repressors that otherwise allow propagation of certain 
viruses in a dormant lysogenic state within the bacter¬ 
ial host. This provides a remarkable illustration of evo¬ 
lutionary adaptation. These repressors, like LexA, also 
undergo self-cleavage at a specific Ala-Gly peptide 
bond, so induction of the SOS response permits repli¬ 
cation of the virus and lysis of the cell, releasing new 
viral particles. Thus the bacteriophage can make a hasty 
exit from a compromised bacterial host cell. 

Synthesis of Ribosomal Proteins Is Coordinated 
with rRNA Synthesis 

In bacteria, an increased cellular demand for protein 
synthesis is met by increasing the number of ribosomes 
rather than altering the activity of individual ribosomes. 
In general, the number of ribosomes increases as the 
cellular growth rate increases. At high growth rates, ri¬ 
bosomes make up approximately 45% of the cell’s dry 
weight. The proportion of cellular resources devoted to 
making ribosomes is so large, and the function of ribo¬ 
somes so important, that cells must coordinate the syn¬ 
thesis of the ribosomal components: the ribosomal pro¬ 
teins (r-proteins) and RNAs (rRNAs). This regulation is 
distinct from the mechanisms described so far, because 
it occurs largely at the level of translation. 

The 52 genes that encode the r-proteins occur in at 
least 20 operons, each with 1 to 11 genes. Some of these 
operons also contain the genes for the subunits of 
DNA primase (see Fig. 25-13), RNA polymerase (see 
Fig. 26-4), and protein synthesis elongation factors (see 
Fig. 27-23)—revealing the close coupling of replication, 
transcription, and protein synthesis during cell growth. 

The r-protein operons are regulated primarily 
through a translational feedback mechanism. One 
r-protein encoded by each operon also functions as a 
translational repressor, which binds to the mRNA 


transcribed from that operon and blocks translation of 
all the genes the messenger encodes (Fig. 28-23). In 
general, the r-protein that plays the role of repressor 
also binds directly to an rRNA. Each translational re¬ 
pressor r-protein binds with higher affinity to the ap¬ 
propriate rRNA than to its mRNA, so the mRNA is bound 
and translation repressed only when the level of the 
r-protein exceeds that of the rRNA. This ensures that 
translation of the mRNAs encoding r-proteins is re¬ 
pressed only when synthesis of these r-proteins exceeds 
that needed to make functional ribosomes. In this way,
the rate of r-protein synthesis is kept in balance with 
rRNA availability. 
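
The feedback rule can be stated compactly: the repressor r-protein fills its higher-affinity rRNA sites first, and only the excess is available to bind the mRNA. The sketch below is a deliberately simplified, all-or-none version of that rule; the function names and the use of simple copy counts in place of binding equilibria are assumptions made for illustration.

```python
def free_repressor(r_protein: int, rrna_sites: int) -> int:
    """r-protein copies left over after the higher-affinity rRNA sites are filled."""
    return max(0, r_protein - rrna_sites)

def translation_repressed(r_protein: int, rrna_sites: int) -> bool:
    """The operon's mRNA is bound (and translation blocked) only by excess r-protein."""
    return free_repressor(r_protein, rrna_sites) > 0

# Hypothetical copy numbers, chosen only to show the two regimes.
print(translation_repressed(r_protein=80, rrna_sites=100))   # False: rRNA in excess
print(translation_repressed(r_protein=120, rrna_sites=100))  # True: r-protein in excess
```

In a real cell the two binding reactions compete continuously, so repression would rise gradually as free r-protein accumulates rather than switching at a sharp threshold.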

The mRNA binding site for the translational re¬ 
pressor is near the translational start site of one of the 
genes in the operon, usually the first gene (Fig. 28-23). 
In other operons this would affect only that one gene, 
because in bacterial polycistronic mRNAs most genes 
have independent translation signals. In the r-protein 
operons, however, the translation of one gene depends 
on the translation of all the others. The mechanism of 
this translational coupling is not yet understood in de¬ 
tail. However, in some cases the translation of multiple 
genes appears to be blocked by folding of the mRNA 
into an elaborate three-dimensional structure that is sta¬ 
bilized both by internal base-pairing (as in Fig. 8-26) 
and by binding of the translational repressor protein. 
When the translational repressor is absent, ribosome 
binding and translation of one or more of the genes dis¬ 
rupts the folded structure of the mRNA and allows all 
the genes to be translated. 

Because the synthesis of r-proteins is coordinated 
with the available rRNA, the regulation of ribosome pro¬ 
duction reflects the regulation of rRNA synthesis. In E. 
coli, rRNA synthesis from the seven rRNA operons re¬ 
sponds to cellular growth rate and to changes in the 
availability of crucial nutrients, particularly amino acids. 
The regulation coordinated with amino acid concentra¬ 
tions is known as the stringent response (Fig. 28-24). 
When amino acid concentrations are low, rRNA synthe¬ 
sis is halted. Amino acid starvation leads to the binding 
of uncharged tRNAs to the ribosomal A site; this trig¬ 
gers a sequence of events that begins with the binding 
of an enzyme called stringent factor (RelA protein) 
to the ribosome. When bound to the ribosome, stringent 
factor catalyzes formation of the unusual nucleotide 
guanosine tetraphosphate (ppGpp; see Fig. 8-42); it 
adds pyrophosphate to the 3' position of GTP, in the 
reaction 

GTP + ATP → pppGpp + AMP

then a phosphohydrolase cleaves off one phosphate to 
form ppGpp. The abrupt rise in ppGpp level in response 
to amino acid starvation results in a great reduction in 
rRNA synthesis, mediated at least in part by the binding
of ppGpp to RNA polymerase.

FIGURE 28-23 Translational feedback in some ribosomal
protein operons. The r-proteins that act as translational
repressors are shaded pink. Each translational repressor
blocks the translation of all genes in that operon by binding
to the indicated site on the mRNA. Genes that encode
subunits of RNA polymerase are shaded yellow; genes that
encode elongation factors are blue. The r-proteins of the
large (50S) ribosomal subunit are designated L1 to L34;
those of the small (30S) subunit, S1 to S21.

FIGURE 28-24 Stringent response in E. coli. This response
to amino acid starvation is triggered by binding of an
uncharged tRNA in the ribosomal A site. A protein called
stringent factor binds to the ribosome and catalyzes the
synthesis of pppGpp, which is converted by a phosphohydrolase
to ppGpp. The signal ppGpp reduces transcription
of some genes and increases that of others, in part by
binding to the β subunit of RNA polymerase and altering
the enzyme's promoter specificity. Synthesis of rRNA is
reduced when ppGpp levels increase.

The nucleotide ppGpp, along with cAMP, belongs to
a class of modified nucleotides that act as cellular sec¬ 
ond messengers (p. 302). In E. coli, these two nu¬ 
cleotides serve as starvation signals; they cause large 
changes in cellular metabolism by increasing or de¬ 
creasing the transcription of hundreds of genes. In eu¬ 
karyotic cells, similar nucleotide second messengers 
also have multiple regulatory functions. The coordina¬ 
tion of cellular metabolism with cell growth is highly 
complex, and further regulatory mechanisms undoubt¬ 
edly remain to be discovered. 

Some Genes Are Regulated 
by Genetic Recombination 

Salmonella typhimurium, which inhabits the mammalian
intestine, moves by rotating the flagella on its
cell surface (Fig. 28-25). The many copies of the protein
flagellin (M_r 53,000) that make up the flagella are
prominent targets of mammalian immune systems. But
Salmonella cells have a mechanism that evades the immune
response: they switch between two distinct flagellin
proteins (FljB and FliC) roughly once every 1,000
generations, using a process called phase variation.

The switch is accomplished by periodic inversion of 
a segment of DNA containing the promoter for a flagellin 
gene. The inversion is a site-specific recombination 
reaction (see Fig. 25-39) mediated by the Hin recombinase 
at specific 14 bp sequences (hix sequences) 



FIGURE 28-25 Salmonella typhimurium, with flagella evident. 

at either end of the DNA segment. When the DNA segment 
is in one orientation, the gene for FljB flagellin and 
the gene encoding a repressor (FljA) are expressed 
(Fig. 28-26a); the repressor shuts down expression of 
the gene for FliC flagellin. When the DNA segment is 
inverted (Fig. 28-26b), the fljA and fljB genes are no 
longer transcribed, and the fliC gene is induced as the 
repressor becomes depleted. The Hin recombinase, encoded 
by the hin gene in the DNA segment that undergoes 
inversion, is expressed when the DNA segment 
is in either orientation, so the cell can always switch 
from one state to the other. 
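
The logic of the switch can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class and function names are hypothetical, the model reports the steady state reached after the FljA repressor has been depleted, and the per-generation switching probability of 1 in 1,000 is an illustrative reading of "roughly once every 1,000 generations."

```python
from dataclasses import dataclass


@dataclass
class InvertibleSegment:
    """Toy model of the Hin-invertible DNA segment (hypothetical names)."""

    # True: the promoter on the segment points toward fljB and fljA.
    promoter_faces_fljB: bool = True

    def invert(self) -> None:
        # hin is expressed in both orientations, so inversion is always possible.
        self.promoter_faces_fljB = not self.promoter_faces_fljB

    def expressed_genes(self) -> set:
        if self.promoter_faces_fljB:
            # FljB flagellin and the FljA repressor are made; FljA silences fliC.
            return {"hin", "fljB", "fljA"}
        # fljB and fljA have lost their promoter; once FljA is depleted,
        # fliC is transcribed (steady state only, an assumption of this sketch).
        return {"hin", "fliC"}


def p_switched_at_least_once(generations: int, p_per_generation: float = 1e-3) -> float:
    """Chance that a lineage has inverted the segment at least once.

    Assumes an independent, constant switching probability per generation;
    the default of 1e-3 is illustrative, not a measured value.
    """
    return 1.0 - (1.0 - p_per_generation) ** generations


if __name__ == "__main__":
    segment = InvertibleSegment()
    print(sorted(segment.expressed_genes()))
    segment.invert()
    print(sorted(segment.expressed_genes()))
    print(p_switched_at_least_once(1000))
```

Because hin appears in both expression sets, the model never gets stuck in one state, which mirrors the point made in the text.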

This type of regulatory mechanism has the advan¬ 
tage of being absolute: gene expression is impossible 


FIGURE 28-26 Regulation of flagellin genes 
in Salmonella: phase variation. The products 
of genes fliC and fljB are different flagellins. 
The hin gene encodes the recombinase that 
catalyzes inversion of the DNA segment 
containing the fljB promoter and the hin gene. 
The recombination sites (inverted repeats) are 
called hix (yellow). (a) In one orientation, fljB 
is expressed along with a repressor protein 
(product of the fljA gene) that represses 
transcription of the fliC gene. (b) In the opposite 
orientation only the fliC gene is expressed; the 
fljA and fljB genes cannot be transcribed. The 
interconversion between these two states, 
known as phase variation, also requires two 
other nonspecific DNA-binding proteins (not 
shown), HU (histonelike protein from U13, a 
strain of E. coli) and FIS (factor for inversion 
stimulation). 
TABLE 28-1 Examples of Gene Regulation by Recombination 

| System | Recombinase/recombination site | Type of recombination | Function |
|---|---|---|---|
| Phase variation (Salmonella) | Hin/hix | Site-specific | Alternative expression of two flagellin genes allows evasion of host immune response. |
| Host range (bacteriophage μ) | Gin/gix | Site-specific | Alternative expression of two sets of tail fiber genes affects host range. |
| Mating-type switch (yeast) | HO endonuclease, RAD52 protein, other proteins/MAT | Nonreciprocal gene conversion* | Alternative expression of two mating types of yeast, a and α, creates cells of different mating types that can mate and undergo meiosis. |
| Antigenic variation (trypanosomes)† | Varies | Nonreciprocal gene conversion* | Successive expression of different genes encoding the variable surface glycoproteins (VSGs) allows evasion of host immune response. |

*In nonreciprocal gene conversion (a class of recombination events not discussed in Chapter 25), genetic information is moved from one part of 
the genome (where it is silent) to another (where it is expressed). The reaction is similar to replicative transposition (see Fig. 25-43). 

†Trypanosomes cause African sleeping sickness and other diseases (see Box 22-2). The outer surface of a trypanosome is made up of multiple 
copies of a single VSG, the major surface antigen. A cell can change surface antigens to more than 100 different forms, precluding an effective 
defense by the host immune system. 


when the gene is physically separated from its promoter 
(note the position of the fljB promoter in Fig. 28-26b). 
An absolute on/off switch may be important in this system 
(even though it affects only one of the two flagellin 
genes), because a flagellum with just one copy of the 
wrong flagellin might be vulnerable to host antibodies 
against that protein. The Salmonella system is by no 
means unique. Similar regulatory systems occur in a number 
of other bacteria and in some bacteriophages, and 
recombination systems with similar functions have been 
found in eukaryotes (Table 28-1). Gene regulation by 
DNA rearrangements that move genes and/or promoters 
is particularly common in pathogens that benefit by 
changing their host range or by changing their surface 
proteins, thereby staying ahead of host immune systems. 

SUMMARY 28.2 Regulation of Gene Expression 
in Prokaryotes 


■ In addition to repression by the Lac repressor, 
the E. coli lac operon undergoes positive 
regulation by the cAMP receptor protein 
(CRP). When [glucose] is low, [cAMP] is high 
and CRP-cAMP binds to a specific site on the 
DNA, stimulating transcription of the lac 
operon and production of lactose-metabolizing 
enzymes. The presence of glucose depresses 
[cAMP], decreasing expression of lac and other 
genes involved in metabolism of secondary 
sugars. A group of coordinately regulated 
operons is referred to as a regulon. 

■ Operons that produce the enzymes of amino 
acid synthesis have a regulatory circuit called 
attenuation, which uses a transcription 
termination site (the attenuator) in the mRNA. 
Formation of the attenuator is modulated by a 
mechanism that couples transcription and 
translation while responding to small changes 
in amino acid concentration. 

■ In the SOS system, multiple unlinked genes 
repressed by a single repressor are induced 
simultaneously when DNA damage triggers 
RecA protein-facilitated autocatalytic 
proteolysis of the repressor. 

■ In the synthesis of ribosomal proteins, one 
protein in each r-protein operon acts as a 
translational repressor. The mRNA is bound by 
the repressor, and translation is blocked only 
when the r-protein is present in excess of 
available rRNA. Some genes are regulated by 
genetic recombination processes that move 
promoters relative to the genes being 
regulated. Regulation can also take place at the 
level of translation. These diverse mechanisms 
permit very sensitive cellular responses to 
environmental change. 
28.3 Regulation of Gene Expression 
in Eukaryotes 

Initiation of transcription is a crucial regulation point for 
both prokaryotic and eukaryotic gene expression. Al¬ 
though some of the same regulatory mechanisms are 
used in both systems, there is a fundamental difference 
in the regulation of transcription in eukaryotes and 
bacteria. 

We can define a transcriptional ground state as the 
inherent activity of promoters and transcriptional ma¬ 
chinery in vivo in the absence of regulatory sequences. 
In bacteria, RNA polymerase generally has access to 
every promoter and can bind and initiate transcription 
at some level of efficiency in the absence of activators 
or repressors; the transcriptional ground state is there¬ 
fore nonrestrictive. In eukaryotes, however, strong pro¬ 
moters are generally inactive in vivo in the absence of 
regulatory proteins; that is, the transcriptional ground 
state is restrictive. This fundamental difference gives 
rise to at least four important features that distinguish 
the regulation of gene expression in eukaryotes from 
that in bacteria. 

First, access to eukaryotic promoters is restricted 
by the structure of chromatin, and activation of tran¬ 
scription is associated with many changes in chromatin 
structure in the transcribed region. Second, although 
eukaryotic cells have both positive and negative regula¬ 
tory mechanisms, positive mechanisms predominate in 
all systems characterized so far. Thus, given that the 
transcriptional ground state is restrictive, virtually every 
eukaryotic gene requires activation to be transcribed. 
Third, eukaryotic cells have larger, more complex mul¬ 
timeric regulatory proteins than do bacteria. Finally, 
transcription in the eukaryotic nucleus is separated 
from translation in the cytoplasm in both space and 
time. 

The complexity of regulatory circuits in eukaryotic 
cells is extraordinary, as the following discussion shows. 
We conclude the section with an illustrated description 
of one of the most elaborate circuits: the regulatory cas¬ 
cade that controls development in fruit flies. 

Transcriptionally Active Chromatin Is Structurally 
Distinct from Inactive Chromatin 

The effects of chromosome structure on gene regula¬ 
tion in eukaryotes have no clear parallel in prokaryotes. 
In the eukaryotic cell cycle, interphase chromosomes 
appear, at first viewing, to be dispersed and amorphous 
(see Figs 12-41, 24-25). Nevertheless, several forms of 
chromatin can be found along these chromosomes. 
About 10% of the chromatin in a typical eukaryotic cell 
is in a more condensed form than the rest of the chromatin. 
This form, heterochromatin, is transcriptionally 
inactive. Heterochromatin is generally associated 
with particular chromosome structures, such as the 
centromeres. The remaining, less condensed 
chromatin is called euchromatin. 

Transcription of a eukaryotic gene is strongly re¬ 
pressed when its DNA is condensed within heterochro¬ 
matin. Some, but not all, of the euchromatin is 
transcriptionally active. Transcriptionally active chro¬ 
mosomal regions can be detected based on their in¬ 
creased sensitivity to nuclease-mediated degradation. 
Nucleases such as DNase I tend to cleave the DNA of 
carefully isolated chromatin into fragments of multiples 
of about 200 bp, reflecting the regular repeating struc¬ 
ture of the nucleosome (see Fig. 24-26). In actively tran¬ 
scribed regions, the fragments produced by nuclease ac¬ 
tivity are smaller and more heterogeneous in size. These 
regions contain hypersensitive sites, sequences es¬ 
pecially sensitive to DNase I, which consist of about 100 
to 200 bp within the 1,000 bp flanking the 5' ends of 
transcribed genes. In some genes, hypersensitive sites 
are found farther from the 5' end, near the 3' end, or 
even within the gene itself. 

Many hypersensitive sites correspond to binding 
sites for known regulatory proteins, and the relative ab¬ 
sence of nucleosomes in these regions may allow the 
binding of these proteins. Nucleosomes are entirely absent 
in some regions that are very active in transcription, 
such as the rRNA genes. Transcriptionally active 
chromatin tends to be deficient in histone H1, which 
binds to the linker DNA between nucleosome particles. 

Histones within transcriptionally active chromatin 
and heterochromatin also differ in their patterns of co¬ 
valent modification. The core histones of nucleosome 
particles (H2A, H2B, H3, H4; see Fig. 24-27) are mod¬ 
ified by irreversible methylation of Lys residues, phos¬ 
phorylation of Ser or Thr residues, acetylation (see be¬ 
low), or attachment of ubiquitin (see Fig. 27-41). Each 
of the core histones has two distinct structural domains. 
A central domain is involved in histone-histone interac¬ 
tion and the wrapping of DNA around the nucleosome. 
A second, lysine-rich amino-terminal domain is gener¬ 
ally positioned near the exterior of the assembled nu¬ 
cleosome particle; the covalent modifications occur at 
specific residues concentrated in this amino-terminal 
domain. The patterns of modification have led some re¬ 
searchers to propose the existence of a histone code, in 
which modification patterns are recognized by enzymes 
that alter the structure of chromatin. Modifications as¬ 
sociated with transcriptional activation would be recog¬ 
nized by enzymes that make the chromatin more ac¬ 
cessible to the transcription machinery. 

5-Methylation of cytosine residues of CpG se¬ 
quences is common in eukaryotic DNA (p. 296), but 
DNA in transcriptionally active chromatin tends to be 
undermethylated. Furthermore, CpG sites in particular 
genes are more often undermethylated in cells from tis¬ 
sues where the genes are expressed than in those where 
the genes are not expressed. The overall pattern suggests 
that active chromatin is prepared for transcription 
by the removal of potential structural barriers. 

Chromatin Is Remodeled by Acetylation and 
Nucleosomal Displacements 

The detailed mechanisms for transcription-associated 
structural changes in chromatin, called chromatin re¬ 
modeling, are now coming to light, including identifi¬ 
cation of a variety of enzymes directly implicated in the 
process. These include enzymes that covalently modify 
the core histones of the nucleosome and others that use 
the chemical energy of ATP to remodel nucleosomes on 
the DNA (Table 28-2). 

The acetylation and deacetylation of histones figure 
prominently in the processes that activate chromatin 
for transcription. As noted above, the amino-terminal 
domains of the core histones are generally rich in Lys 
residues. Particular Lys residues are acetylated by 
histone acetyltransferases (HATs). Cytosolic (type B) 
HATs acetylate newly synthesized histones before the 
histones are imported into the nucleus. The subsequent 
assembly of the histones into chromatin is facilitated by 
additional proteins: CAF1 for H3 and H4, and NAP1 for 
H2A and H2B. (See Table 28-2 for an explanation of 
some of these abbreviated names.) 

Where chromatin is being activated for transcrip¬ 
tion, the nucleosomal histones are further acetylated by 
nuclear (type A) HATs. The acetylation of multiple Lys 
residues in the amino-terminal domains of histones H3 
and H4 can reduce the affinity of the entire nucleosome 
for DNA. Acetylation may also prevent or promote in¬ 
teractions with other proteins involved in transcription 
or its regulation. When transcription of a gene is no 
longer required, the acetylation of nucleosomes in that 
vicinity is reduced by the activity of histone deacetylases, 
as part of a general gene-silencing process that 
restores the chromatin to a transcriptionally inactive 
state. In addition to the removal of certain acetyl groups, 
new covalent modification of histones marks chromatin 
as transcriptionally inactive. As an example, the Lys 
residue at position 9 in histone H3 is often methylated 
in heterochromatin. 

Chromatin remodeling also requires protein com¬ 
plexes that actively move or displace nucleosomes, hy¬ 
drolyzing ATP in the process (Table 28-2). The enzyme 
complex SWI/SNF, found in all eukaryotic cells, contains 
11 polypeptides (total Mr 2 × 10^6) that together create 
hypersensitive sites in the chromatin and stimulate the 
binding of transcription factors. SWI/SNF is not required 
for the transcription of every gene. NURF is another 
ATP-dependent enzyme complex that remodels chro¬ 
matin in ways that complement and overlap the activ¬ 
ity of SWI/SNF. These enzyme complexes play an im¬ 
portant role in preparing a region of chromatin for active 
transcription. 

Many Eukaryotic Promoters Are Positively Regulated 

As already noted, eukaryotic RNA polymerases have lit¬ 
tle or no intrinsic affinity for their promoters; initiation 
of transcription is almost always dependent on the 
action of multiple activator proteins. One important 
reason for the apparent predominance of positive regu¬ 
lation seems obvious: the storage of DNA within chro¬ 
matin effectively renders most promoters inaccessible, 
so genes are normally silent in the absence of other reg¬ 
ulation. The structure of chromatin affects access to 
some promoters more than others, but repressors that 
bind to DNA so as to preclude access of RNA polymerase 
(negative regulation) would often be simply redundant. 
Other factors are at play in the use of positive regulation, 
and speculation generally centers around two: the 
large size of eukaryotic genomes and the greater 
efficiency of positive regulation. 

TABLE 28-2 Some Enzyme Complexes Catalyzing Chromatin Structural Changes Associated with Transcription 

| Enzyme complex* | Oligomeric structure (number of polypeptides) | Source | Activities |
|---|---|---|---|
| GCN5-ADA2-ADA3 | 3 | Yeast | GCN5 has type A HAT activity |
| SAGA/PCAF | >20 | Eukaryotes | Includes GCN5-ADA2-ADA3 |
| SWI/SNF | 11; total Mr 2 × 10^6 | Eukaryotes | ATP-dependent nucleosome remodeling |
| NURF | 4; total Mr 500,000 | Drosophila | ATP-dependent nucleosome remodeling |
| CAF1 | >2 | Humans; Drosophila | Responsible for binding histones H3 and H4 to DNA |
| NAP1 | 1; Mr 125,000 | Widely distributed in eukaryotes | Responsible for binding histones H2A and H2B to DNA |

*The abbreviations for eukaryotic genes and proteins are often more confusing or obscure than those used for bacteria. The complex of GCN5 
(general control nonderepressible) and ADA (alteration/deficiency activation) proteins was discovered during investigation of the regulation of 
nitrogen metabolism genes in yeast. These proteins can be part of the larger SAGA complex (SPT, ADA2,3, GCN5, acetyltransferase) in yeasts. 
The equivalent of SAGA in humans is PCAF (p300/CBP-associated factor). SWI (switching) was discovered as a protein required for expression 
of certain genes involved in mating-type switching in yeast, and SNF (sucrose nonfermenting) as a factor for expression of the yeast gene for 
sucrase. Subsequent studies revealed multiple SWI and SNF proteins that acted in a complex. The SWI/SNF complex has a role in the 
expression of a wide range of genes and has been found in many eukaryotes, including humans. NURF is nuclear remodeling factor; CAF1, 
chromatin assembly factor; and NAP1, nucleosome assembly protein. 

First, nonspecific DNA binding of regulatory pro¬ 
teins becomes a more important problem in the much 
larger genomes of higher eukaryotes. And the chance 
that a single specific binding sequence will occur ran¬ 
domly at an inappropriate site also increases with 
genome size. Specificity for transcriptional activation 
can be improved if each of several positive-regulatory 
proteins must bind specific DNA sequences and then 
form a complex in order to become active. The average 
number of regulatory sites for a gene in a multicellular 
organism is probably at least five. The requirement for 
binding of several positive-regulatory proteins to spe¬ 
cific DNA sequences vastly reduces the probability of 
the random occurrence of a functional juxtaposition of 
all the necessary binding sites. In principle, a similar 
strategy could be used by multiple negative-regulatory 
elements, but this brings us to the second reason for the 
use of positive regulation: it is simply more efficient. If 
the 30,000 to 35,000 genes in the human genome were 
negatively regulated, each cell would have to synthe¬ 
size, at all times, this same number of different repres¬ 
sors (or many times this number if multiple regulatory 
elements were used at each promoter) in concentra¬ 
tions sufficient to permit specific binding to each “un¬ 
wanted” gene. In positive regulation, most of the genes 
are normally inactive (that is, RNA polymerases do not 
bind to the promoters) and the cell synthesizes only the 
activator proteins needed to promote transcription of 
the subset of genes required in the cell at that time. 
These arguments notwithstanding, there are examples 
of negative regulation in eukaryotes, from yeast to hu¬ 
mans, as we shall see. 
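
The genome-size argument can be made concrete with a rough calculation. The sketch below assumes a random DNA sequence with equal base frequencies, counts both strands, and treats a required cluster of several sites at fixed spacing as one longer composite site; all three are simplifications. The function names and the approximate genome sizes are illustrative choices, not values from this chapter.

```python
# Approximate genome sizes in base pairs (illustrative round numbers).
E_COLI_GENOME_BP = 4.6e6
HUMAN_GENOME_BP = 3.2e9


def expected_random_sites(genome_bp: float, site_bp: int) -> float:
    """Expected number of chance matches to one specific site.

    Assumes random sequence with equal base frequencies, so any given
    position matches with probability (1/4)**site_bp; the factor of 2
    counts both strands (palindromic sites are not treated specially).
    """
    return 2.0 * genome_bp / 4.0 ** site_bp


def expected_random_clusters(genome_bp: float, site_bp: int, n_sites: int) -> float:
    """Expected chance occurrences of n_sites specific sites together.

    Simplification: the cluster is treated as a single composite site
    with site_bp * n_sites specified positions at a fixed spacing.
    """
    return expected_random_sites(genome_bp, site_bp * n_sites)


if __name__ == "__main__":
    for name, size in (("E. coli", E_COLI_GENOME_BP), ("human", HUMAN_GENOME_BP)):
        single = expected_random_sites(size, 8)
        cluster = expected_random_clusters(size, 8, 5)
        print(f"{name}: one 8 bp site {single:.3g}, five 8 bp sites {cluster:.3g}")
```

Under these assumptions a single 8 bp site is expected to turn up by chance on the order of 10^5 times in a genome of human size, whereas a required combination of five such sites is expected far less than once, which is the point of the specificity argument.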

DNA-Binding Transactivators and Coactivators 
Facilitate Assembly of the General 
Transcription Factors 

To continue our exploration of the regulation of gene 
expression in eukaryotes, we return to the interactions 
between promoters and RNA polymerase II (Pol II), the 
enzyme responsible for the synthesis of eukaryotic 
mRNAs. Although most (but not all) Pol II promoters 
include the TATA box and Inr (initiator) sequences, with 
their standard spacing (see Fig. 26-8), they vary greatly 
in both the number and the location of additional 
sequences required for the regulation of transcription. 
These additional regulatory sequences are usually called 
enhancers in higher eukaryotes and upstream activator 
sequences (UASs) in yeast. A typical enhancer 
may be found hundreds or even thousands of base pairs 
upstream from the transcription start site, or may even 
be downstream, within the gene itself. When bound by 
the appropriate regulatory proteins, an enhancer increases 
transcription at nearby promoters regardless of 
its orientation in the DNA. The UASs of yeast function 
in a similar way, although generally they must be positioned 
upstream and within a few hundred base pairs of 
the transcription start site. An average Pol II promoter 
may be affected by a half-dozen regulatory sequences 
of this type, and even more complex promoters are quite 
common. 
Successful binding of active RNA polymerase II 
holoenzyme at one of its promoters usually requires 
the action of other proteins (Fig. 28-27), of three types: 
(1) basal transcription factors (see Fig. 26-9, Table 
26-1), required at every Pol II promoter; (2) DNA- 
binding transactivators, which bind to enhancers or 
UASs and facilitate transcription; and (3) coactivators. 
The latter group act indirectly—not by binding to the 
DNA—and are required for essential communication be¬ 
tween the DNA-binding transactivators and the complex 
composed of Pol II and the general transcription factors. 
Furthermore, a variety of repressor proteins can inter¬ 
fere with communication between the RNA polymerase 
and the DNA-binding transactivators, resulting in re¬ 
pression of transcription (Fig. 28-27b). Here we focus 
on the protein complexes shown in Figure 28-27 and 
on how they interact to activate transcription. 

TATA-Binding Protein The first component to bind in the 
assembly of a preinitiation complex at the TATA box of 
a typical Pol II promoter is the TATA-binding protein 
(TBP). The complete complex includes the basal 
(or general) transcription factors TFIIB, TFIIE, TFIIF, 
TFIIH; Pol II; and perhaps TFIIA (not all of the factors 
are shown in Fig. 28-27). This minimal preinitiation 
complex, however, is often insufficient for the initiation 
of transcription and generally does not form at all if the 
promoter is obscured within chromatin. Positive regu¬ 
lation leading to transcription is imposed by the trans¬ 
activators and coactivators. 

DNA-Binding Transactivators The requirements for transactivators 
vary greatly from one promoter to another. A 
few transactivators are known to facilitate transcription 
at hundreds of promoters, whereas others are specific 
for a few promoters. Many transactivators are sensitive 
to the binding of signal molecules, providing the capacity 
to activate or deactivate transcription in response to 
a changing cellular environment. Some enhancers bound 
by DNA-binding transactivators are quite distant from 
the promoter’s TATA box. How do the transactivators 
function at a distance? The answer in most cases seems 
to be that, as indicated earlier, the intervening DNA is 
looped so that the various protein complexes can interact 
directly. The looping is promoted by certain nonhistone 
proteins that are abundant in chromatin and 
bind nonspecifically to DNA. These high mobility group 
(HMG) proteins (Fig. 28-27; “high mobility” refers to 
their electrophoretic mobility in polyacrylamide gels) 
play an important structural role in chromatin remodeling 
and transcriptional activation. 

FIGURE 28-27 Eukaryotic promoters and regulatory proteins. RNA 
polymerase II and its associated general transcription factors form a 
preinitiation complex at the TATA box and Inr site of the cognate 
promoters, a process facilitated by DNA-binding transactivators, acting 
through TFIID and/or mediator. (a) A composite promoter with typical 
sequence elements and protein complexes found in both yeast and 
higher eukaryotes. The carboxyl-terminal domain (CTD) of Pol II (see 
Fig. 26-9) is an important point of interaction with mediator and other 
protein complexes. Not shown are the protein complexes required for 
histone acetylation and chromatin remodeling. For the DNA-binding 
transactivators, DNA-binding domains are shown in green, activation 
domains in pink. The interactions symbolized by blue arrows are 
discussed in the text. (b) A wide variety of eukaryotic transcriptional 
repressors function by a range of mechanisms. Some bind directly to 
DNA, displacing a protein complex required for activation; others 
interact with various parts of the transcription or activation complexes 
to prevent activation. Possible points of interaction are indicated with 
red arrows. 


Coactivator Protein Complexes Most transcription re¬ 
quires the presence of additional protein complexes. 
Some major regulatory protein complexes that interact 
with Pol II have been defined both genetically and bio¬ 
chemically. These coactivator complexes act as inter¬ 
mediaries between the DNA-binding transactivators and 
the Pol II complex. 

The best-characterized coactivator is the transcription 
factor TFIID (Fig. 28-27). In eukaryotes, TFIID is 
a large complex that includes TBP and ten or more 
TBP-associated factors (TAFs). Some TAFs resemble histones 
and may play a role in displacing nucleosomes during 
the activation of transcription. Many DNA-binding 
transactivators aid in transcription initiation by interacting 
with one or more TAFs. The requirement for 
TAFs to initiate transcription can vary greatly from one 
gene to another. Some promoters require TFIID, some 
do not, and some require only subsets of the TFIID TAF 
subunits. 

Another important coactivator consists of 20 or 
more polypeptides in a protein complex called mediator 
(Fig. 28-27); the 20 core polypeptides are highly 
conserved from fungi to humans. Mediator binds tightly 
to the carboxyl-terminal domain (CTD) of the largest 
subunit of Pol II. The mediator complex is required for 
both basal and regulated transcription at promoters 
used by Pol II, and it also stimulates the phosphorylation 
of the CTD by TFIIH. Both mediator and TFIID are 
required at some promoters. As with TFIID, some 
DNA-binding transactivators interact with one or more 
components of the mediator complex. Coactivator complexes 
function at or near the promoter’s TATA box. 

Choreography of Transcriptional Activation We can now begin 
to piece together the sequence of transcriptional activation 
events at a typical Pol II promoter. First, crucial 
remodeling of the chromatin takes place in stages. 
Some DNA-binding transactivators have significant 
affinity for their binding sites even when the sites are 
within condensed chromatin. Binding of one transactivator 
may facilitate the binding of others, gradually displacing 
some nucleosomes. 

The bound transactivators can then interact di¬ 
rectly with HATs or enzyme complexes such as 
SWI/SNF (or both), accelerating the remodeling of the 
surrounding chromatin. In this way a bound transac¬ 
tivator can draw in other components necessary for 
further chromatin remodeling to permit transcription 
of specific genes. The bound transactivators, gener¬ 
ally acting through complexes such as TFIID or me¬ 
diator (or both), stabilize the binding of Pol II and its 
associated transcription factors and greatly facilitate 
formation of the preinitiation transcription complex. 
Complexity in these regulatory circuits is the rule 
rather than the exception, with multiple DNA-bound 
transactivators promoting transcription. 
The script can change from one promoter to another, 
but most promoters seem to require a precisely 
ordered assembly of components to initiate transcription. 
The assembly process is not always fast. At some 
genes it may take minutes; at certain genes in higher 
eukaryotes the process can take days. 

Reversible Transcriptional Activation Although rarer, some 
eukaryotic regulatory proteins that bind to Pol II pro¬ 
moters can act as repressors, inhibiting the formation 
of active preinitiation complexes (Fig. 28-27b). Some 
transactivators can adopt different conformations, en¬ 
abling them to serve as transcriptional activators or re¬ 
pressors. For example, some steroid hormone receptors 
(described later) function in the nucleus as DNA- 
binding transactivators, stimulating the transcription of 
certain genes when a particular steroid hormone signal 
is present. When the hormone is absent, the receptor 
proteins revert to a repressor conformation, prevent¬ 
ing the formation of preinitiation complexes. In some 
cases this repression involves interaction with histone 
deacetylases and other proteins that help restore the 
surrounding chromatin to its transcriptionally inactive 
state. 

The Genes of Galactose Metabolism in Yeast Are 
Subject to Both Positive and Negative Regulation 

Some of the general principles described above can be 
illustrated by one well-studied eukaryotic regulatory 
circuit (Fig. 28-28). The enzymes required for the im¬ 
portation and metabolism of galactose in yeast are en¬ 
coded by genes scattered over several chromosomes 
(Table 28-3). Each of the GAL genes is transcribed sep¬ 
arately, and yeast cells have no operons like those in 
bacteria. However, all the GAL genes have similar pro¬ 
moters and are regulated coordinately by a common set 
of proteins. The promoters for the GAL genes consist 
of the TATA box and Inr sequences, as well as an upstream 
activator sequence (UAS_G) recognized by a 
DNA-binding transcriptional activator known as Gal4 
protein (Gal4p). Regulation of gene expression by galactose 
entails an interplay between Gal4p and two other 
proteins, Gal80p and Gal3p (Fig. 28-28). Gal80p forms 
a complex with Gal4p, preventing Gal4p from function¬ 
ing as an activator of the GAL promoters. When galac¬ 
tose is present, it binds Gal3p, which then interacts with 
Gal80p, allowing Gal4p to function as an activator at the 
various GAL promoters. 

Other protein complexes also have a role in acti¬ 
vating transcription of the GAL genes. These may in¬ 
clude the SAGA complex for histone acetylation, the 
SWI/SNF complex for nucleosome remodeling, and the 
mediator complex. Figure 28-29 provides an idea of the 
complexity of protein interactions in the overall process 
of transcriptional activation in eukaryotic cells. 







FIGURE 28-28 Regulation of transcription at genes of galactose 
metabolism in yeast. Galactose is imported into the cell and converted 
to glucose 6-phosphate by a pathway involving six enzymes whose 
genes are scattered over three chromosomes (see Table 28-3). Transcription 
of these genes is regulated by the combined actions of the 
proteins Gal4p, Gal80p, and Gal3p, with Gal4p playing the central 
role of DNA-binding transactivator. The Gal4p-Gal80p complex is in¬ 
active in gene activation. Binding of galactose to Gal3p and its inter¬ 
action with Gal80p produce a conformational change in Gal80p that 
allows Gal4p to function in transcription activation. 


Glucose is the preferred carbon source for yeast, as 
it is for bacteria. When glucose is present, most of the 
GAL genes are repressed—whether galactose is present 
or not. The GAL regulatory system described above is 
effectively overridden by a complex catabolite repres¬ 
sion system that includes several proteins (not depicted 
in Fig. 28-29). 
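
The regulatory logic described in this section can be summarized as a small truth table. The Python sketch below is a deliberately simplified Boolean model: the function name is hypothetical, protein levels are treated as all-or-none, and the catabolite repression system is reduced to a single override even though the text says only that most of the GAL genes are repressed by glucose.

```python
def gal_genes_transcribed(galactose: bool, glucose: bool) -> bool:
    """Boolean sketch of GAL gene control (hypothetical helper).

    Returns True when Gal4p is free to activate the GAL promoters.
    """
    if glucose:
        # Catabolite repression overrides the Gal4p/Gal80p/Gal3p circuit
        # (simplified here to a complete shutoff).
        return False
    gal3p_bound_to_galactose = galactose
    # Gal80p blocks Gal4p unless galactose-bound Gal3p interacts with it.
    gal80p_blocks_gal4p = not gal3p_bound_to_galactose
    return not gal80p_blocks_gal4p


if __name__ == "__main__":
    for galactose in (False, True):
        for glucose in (False, True):
            state = "on" if gal_genes_transcribed(galactose, glucose) else "off"
            print(f"galactose={galactose!s:5} glucose={glucose!s:5} -> GAL genes {state}")
```

In this model the GAL genes are on only when galactose is present and glucose is absent, so both positive control (Gal4p) and negative control (Gal80p and catabolite repression) shape the outcome.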

DNA-Binding Transactivators Have a
Modular Structure

DNA-binding transactivators typically have a distinct
structural domain for specific DNA binding and one or
more additional domains for transcriptional activation
or for interaction with other regulatory proteins. Interaction
of two regulatory proteins is often mediated by
domains containing leucine zippers (Fig. 28-14) or
helix-loop-helix motifs (Fig. 28-15). We consider here three
distinct types of structural domains used in activation
by DNA-binding transactivators (Fig. 28-30a): Gal4p,
Sp1, and CTF1.

TABLE 28-3 Genes of Galactose Metabolism in Yeast

                                                                                      Relative protein expression
                                                                                      in different carbon sources
                                                   Chromosomal  Protein size
Gene    Protein function                           location     (number of residues)  Glucose  Glycerol  Galactose

Regulated genes
GAL1    Galactokinase                              II           528                   -        -         +++
GAL2    Galactose permease                         XII          574                   -        -         +++
PGM2    Phosphoglucomutase                         XIII         569                   +        +         ++
GAL7    Galactose 1-phosphate uridylyltransferase  II           365                   -        -         +++
GAL10   UDP-glucose 4-epimerase                    II           699                   -        -         +++
MEL1    α-Galactosidase                            II           453                   -        +         ++

Regulatory genes
GAL3    Inducer                                    IV           520                   -        +         ++
GAL4    Transcriptional activator                  XVI          881                   +/-      +         +
GAL80   Transcriptional inhibitor                  XIII         435                   +        +         ++

Source: Adapted from Reece, R. & Platt, A. (1997) Signaling activation and repression of RNA polymerase II transcription in yeast. Bioessays
19, 1001-1010.

FIGURE 28-29 Protein complexes involved in transcription activation
of a group of related eukaryotic genes. The GAL system illustrates
the complexity of this process, but not all these protein complexes are
yet known to affect GAL gene transcription. Note that many of the
complexes (such as SWI/SNF, GCN5-ADA2-ADA3, and mediator) affect
the transcription of many genes. The complexes assemble stepwise.
First the DNA-binding transactivators bind, then the additional
protein complexes needed to remodel the chromatin and allow
transcription to begin.

Gal4p contains a zinc fingerlike structure in its
DNA-binding domain, near the amino terminus; this domain
has six Cys residues that coordinate two Zn2+. The
protein functions as a homodimer (with dimerization
mediated by interactions between two coiled coils) and
binds to UASG, a palindromic DNA sequence about 17 bp
long. Gal4p has a separate activation domain with many
acidic amino acid residues. Experiments that substitute
a variety of different peptide sequences for the acidic
activation domain of Gal4p suggest that the acidic nature
of this domain is critical to its function, although
its precise amino acid sequence can vary considerably.

Sp1 (Mr 80,000) is a DNA-binding transactivator
for a large number of genes in higher eukaryotes. Its
DNA binding site, the GC box (consensus sequence
GGGCGG), is usually quite near the TATA box. The
DNA-binding domain of the Sp1 protein is near its
carboxyl terminus and contains three zinc fingers. Two
other domains in Sp1 function in activation, and are
notable in that 25% of their amino acid residues are Gln.
A wide variety of other activator proteins also have these
glutamine-rich domains.

FIGURE 28-30 DNA-binding transactivators. (a) Typical DNA-binding
transactivators such as CTF1, Gal4p, and Sp1 have a DNA-binding
domain and an activation domain. The nature of the activation
domain is indicated by symbols: - - -, acidic; Q Q Q, glutamine-rich;
P P P, proline-rich. Some or all of these proteins may activate
transcription by interacting with intermediary complexes such as TFIID or
mediator. Note that the binding sites illustrated here are not generally
found together near a single gene. (b) A chimeric protein containing
the DNA-binding domain of Sp1 and the activation domain of CTF1
activates transcription if a GC box is present.

CCAAT-binding transcription factor 1 (CTF1)
belongs to a family of DNA-binding transactivators that
bind a sequence called the CCAAT site (its consensus
sequence is TGGN6GCCAA, where N is any nucleotide).
The DNA-binding domain of CTF1 contains many basic
amino acid residues, and the binding region is probably
arranged as an α helix. This protein has neither a
helix-turn-helix nor a zinc finger motif; its DNA-binding
mechanism is not yet clear. CTF1 has a proline-rich
activation domain, with Pro accounting for more than 20%
of the amino acid residues.

The discrete activation and DNA-binding domains
of regulatory proteins often act completely independently,
as has been demonstrated in “domain-swapping”
experiments. Genetic engineering techniques
(Chapter 9) can join the proline-rich activation domain of
CTF1 to the DNA-binding domain of Sp1 to create a protein
that, like normal Sp1, binds to GC boxes on the DNA
and activates transcription at a nearby promoter (as in
Fig. 28-30b). The DNA-binding domain of Gal4p has
similarly been replaced experimentally with the
DNA-binding domain of the prokaryotic LexA repressor (of
the SOS response; Fig. 28-22). This chimeric protein
neither binds at UASG nor activates the yeast GAL genes
(as would normal Gal4p) unless the UASG sequence in
the DNA is replaced by the LexA recognition site.
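
The independence of the two kinds of domain can be illustrated with a small data model. This is a minimal sketch under stated assumptions: the class `Transactivator`, the functions `swap_domains` and `activates`, and the string labels used for binding sites are hypothetical names chosen for illustration, and a protein is assumed to activate a promoter whenever its site is present and it carries any activation domain.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Transactivator:
    name: str
    binding_site: str                 # element recognized by the DNA-binding domain
    activation_domain: Optional[str]  # None if the protein has no activation domain


def swap_domains(dna_binding_from: Transactivator,
                 activation_from: Transactivator) -> Transactivator:
    """Build a chimera: DNA-binding domain of one protein, activation domain of another."""
    return Transactivator(
        name=f"{dna_binding_from.name}(DBD)-{activation_from.name}(AD)",
        binding_site=dna_binding_from.binding_site,
        activation_domain=activation_from.activation_domain,
    )


def activates(protein: Transactivator, promoter_elements: Set[str]) -> bool:
    """Assumption: activation needs the binding site plus some activation domain."""
    return (protein.binding_site in promoter_elements
            and protein.activation_domain is not None)


SP1 = Transactivator("Sp1", "GC box", "glutamine-rich")
CTF1 = Transactivator("CTF1", "CCAAT site", "proline-rich")
GAL4P = Transactivator("Gal4p", "UASG", "acidic")
LEXA = Transactivator("LexA", "LexA site", None)  # a repressor, no activation domain

sp1_ctf1 = swap_domains(SP1, CTF1)     # binds GC boxes, proline-rich activation
lexa_gal4 = swap_domains(LEXA, GAL4P)  # binds the LexA site, acidic activation

print(sp1_ctf1, activates(sp1_ctf1, {"GC box"}))
print(lexa_gal4, activates(lexa_gal4, {"UASG"}), activates(lexa_gal4, {"LexA site"}))
```

The LexA-Gal4p chimera in this sketch activates only a promoter carrying the LexA site, mirroring the experiment described in the text.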

Eukaryotic Gene Expression Can Be Regulated 
by Intercellular and Intracellular Signals 

The effects of steroid hormones (and of thyroid and
retinoid hormones, which have the same mode of action)
provide additional well-studied examples of the
modulation of eukaryotic regulatory proteins by direct
interaction with molecular signals (see Fig. 12-40).
Unlike other types of hormones, steroid hormones do not
have to bind to plasma membrane receptors. Instead,
they can interact with intracellular receptors that are
themselves transcriptional transactivators. Steroid hormones
too hydrophobic to dissolve readily in the blood
(estrogen, progesterone, and cortisol, for example)
travel on specific carrier proteins from their point of
release to their target tissues. In the target tissue, the
hormone passes through the plasma membrane by simple
diffusion and binds to its specific receptor protein in the
nucleus. The hormone-receptor complex acts by binding
to highly specific DNA sequences called hormone
response elements (HREs), thereby altering gene
expression. Hormone binding triggers changes in the
conformation of the receptor proteins so that they become
capable of interacting with additional transcription
factors. The bound hormone-receptor complex can either
enhance or suppress the expression of adjacent genes.

The DNA sequences (HREs) to which hormone-receptor
complexes bind are similar in length and
arrangement, but differ in sequence, for the various
steroid hormones. Each receptor has a consensus HRE
sequence (Table 28-4) to which the hormone-receptor
complex binds well, with each consensus consisting of
two six-nucleotide sequences, either contiguous or
separated by three nucleotides, in tandem or in a palindromic
arrangement. The hormone receptors have a highly
conserved DNA-binding domain with two zinc fingers
(Fig. 28-31). The hormone-receptor complex binds to
the DNA as a dimer, with the zinc finger domains of each
monomer recognizing one of the six-nucleotide
sequences. The ability of a given hormone to act through
the hormone-receptor complex to alter the expression
of a specific gene depends on the exact sequence of the
HRE, its position relative to the gene, and the number
of HREs associated with the gene.

TABLE 28-4 Hormone Response Elements (HREs)
Bound by Steroid-Type Hormone Receptors

Receptor                Consensus sequence bound*

Androgen                GG(A/T)ACAN2TGTTCT
Glucocorticoid          GGTACAN3TGTTCT
Retinoic acid (some)    AGGTCAN5AGGTCA
Vitamin D               AGGTCAN3AGGTCA
Thyroid hormone         AGGTCAN3AGGTCA
RX†                     AGGTCANAGGTCANAGGTCANAGGTCA

*N represents any nucleotide.
†Forms a dimer with the retinoic acid receptor or vitamin D receptor.
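
Consensus sequences of this kind lend themselves to a simple pattern search. The sketch below is illustrative only: the names `HRE_CONSENSUS`, `consensus_to_regex`, and `find_hres` are hypothetical, the notation N followed by a number is read as that many arbitrary nucleotides, only the strand supplied is scanned, and an exact match to the consensus is required, whereas real receptors tolerate some deviation.

```python
import re

# Consensus sequences taken from Table 28-4 (N = any nucleotide).
HRE_CONSENSUS = {
    "androgen": "GG(A/T)ACAN2TGTTCT",
    "glucocorticoid": "GGTACAN3TGTTCT",
    "retinoic acid (some)": "AGGTCAN5AGGTCA",
    "vitamin D": "AGGTCAN3AGGTCA",
    "thyroid hormone": "AGGTCAN3AGGTCA",
}


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate the table notation into a regular expression."""
    # (A/T) means either base at that position.
    pattern = re.sub(r"\(([ACGT])/([ACGT])\)", r"[\1\2]", consensus)
    # N followed by a count means that many arbitrary nucleotides.
    pattern = re.sub(r"N(\d+)", r"[ACGT]{\1}", pattern)
    # A bare N means one arbitrary nucleotide.
    pattern = pattern.replace("N", "[ACGT]")
    return re.compile(pattern)


def find_hres(dna: str, receptor: str):
    """Return (start, matched sequence) for each consensus match on this strand."""
    regex = consensus_to_regex(HRE_CONSENSUS[receptor])
    return [(m.start(), m.group()) for m in regex.finditer(dna.upper())]


sequence = "ttGGTACAAAATGTTCTcc"  # made-up test sequence
for receptor in HRE_CONSENSUS:
    print(receptor, find_hres(sequence, receptor))
```

Counting the matches near a gene would give a crude handle on one of the factors named in the text, the number of HREs associated with the gene.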

Unlike the DNA-binding domain, the ligand-binding
region of the receptor protein (always at the carboxyl
terminus) is quite specific to the particular receptor. In
the ligand-binding region, the glucocorticoid receptor is
only 30% similar to the estrogen receptor and 17% similar
to the thyroid hormone receptor. The size of the
ligand-binding region varies dramatically; in the vitamin D
receptor it has only 25 amino acid residues, whereas in
the mineralocorticoid receptor it has 603 residues.
Mutations that change one amino acid in these regions can
result in loss of responsiveness to a specific hormone.
Some humans unable to respond to cortisol, testosterone,
vitamin D, or thyroxine have mutations of this
type.

Regulation Can Result from Phosphorylation 
of Nuclear Transcription Factors 

We noted in Chapter 12 that the effects of insulin on
gene expression are mediated by a series of steps leading
ultimately to the activation of a protein kinase in the
nucleus that phosphorylates specific DNA-binding proteins
and thereby alters their ability to act as transcription
factors (see Fig. 12-6). This general mechanism
mediates the effects of many nonsteroid hormones.
For example, the β-adrenergic pathway that leads to
elevated levels of cytosolic cAMP, which acts as a second
messenger in eukaryotes as well as in prokaryotes (see
Figs 12-12, 28-18), also affects the transcription of a
set of genes, each of which is located near a specific
DNA sequence called a cAMP response element (CRE).
The catalytic subunit of protein kinase A, released when
cAMP levels rise (see Fig. 12-15), enters the nucleus
and phosphorylates a nuclear protein, the CRE-binding
protein (CREB). When phosphorylated, CREB binds to
CREs near certain genes and acts as a transcription
factor, turning on the expression of these genes.

Many Eukaryotic mRNAs Are Subject 
to Translational Repression 

Regulation at the level of translation assumes a much
more prominent role in eukaryotes than in bacteria and
is observed in a range of cellular situations. In contrast to
the tight coupling of transcription and translation in
bacteria, the transcripts generated in a eukaryotic nucleus
must be processed and transported to the cytoplasm
before translation. This can impose a significant delay on
the appearance of a protein. When a rapid increase in
protein production is needed, a translationally repressed
mRNA already in the cytoplasm can be activated for
translation without delay. Translational regulation may
play an especially important role in regulating certain
very long eukaryotic genes (a few are measured in the
millions of base pairs), for which transcription and
mRNA processing can require many hours. Some genes
are regulated at both the transcriptional and translational
stages, with the latter playing a role in the
fine-tuning of cellular protein levels. In some anucleate cells,
such as reticulocytes (immature erythrocytes),
transcriptional control is entirely unavailable and translational
control of stored mRNAs becomes essential. As
described below, translational controls can also have
spatial significance during development, when the
regulated translation of prepositioned mRNAs creates a
local gradient of the protein product.

FIGURE 28-31 Typical steroid hormone receptors.
These receptor proteins have a binding site for the
hormone, a DNA-binding domain, and a region that
activates transcription of the regulated gene. The highly
conserved DNA-binding domain has two zinc fingers.
The sequence shown here is that for the estrogen
receptor, but the residues in bold type are common to
all steroid hormone receptors.

Eukaryotes have at least three main mechanisms of 
translational regulation. 

1. Initiation factors are subject to phosphorylation by 
a number of protein kinases. The phosphorylated 
forms are often less active and cause a general 
depression of translation in the cell. 

2. Some proteins bind directly to mRNA and act as 
translational repressors, many of them binding at 
specific sites in the 3' untranslated region 
(3'UTR). So positioned, these proteins interact 
with other translation initiation factors bound to 
the mRNA or with the 40S ribosomal subunit to 
prevent translation initiation (Fig. 28-32; compare 
this with Fig. 27-22). 

3. Binding proteins, present in eukaryotes from yeast 
to mammals, disrupt the interaction between 
eIF4E and eIF4G (see Fig. 27-22). The mammalian 
versions are known as 4E-BPs (eIF4E binding 
proteins). When cell growth is slow, these proteins 
limit translation by binding to the site on eIF4E 
that normally interacts with eIF4G. When cell 
growth resumes or increases in response to 
growth factors or other stimuli, the binding 
proteins are inactivated by protein kinase- 
dependent phosphorylation. 

The variety of translational regulation mechanisms provides
flexibility, allowing focused repression of a few
mRNAs or global regulation of all cellular translation.

Translational regulation has been particularly well
studied in reticulocytes. One such mechanism in these
cells involves eIF2, the initiation factor that binds to the
initiator tRNA and conveys it to the ribosome; when
Met-tRNA has bound to the P site, the factor eIF2B
binds to eIF2, recycling it with the aid of GTP binding
and hydrolysis. The maturation of reticulocytes includes
destruction of the cell nucleus, leaving behind a plasma
membrane packed with hemoglobin. Messenger RNAs
deposited in the cytoplasm before the loss of the nucleus
allow for the replacement of hemoglobin. When
reticulocytes become deficient in iron or heme, the
translation of globin mRNAs is repressed. A protein kinase
called HCR (hemin-controlled repressor) is activated,
catalyzing the phosphorylation of eIF2. In its
phosphorylated form, eIF2 forms a stable complex with
eIF2B that sequesters the eIF2, making it unavailable
for participation in translation. In this way, the reticulocyte
coordinates the synthesis of globin with the availability
of heme.

FIGURE 28-32 Translational regulation of eukaryotic mRNA. One of
the most important mechanisms for translational regulation in
eukaryotes involves the binding of translational repressors (RNA-binding
proteins) to specific sites in the 3' untranslated region (3'UTR) of the
mRNA. These proteins interact with eukaryotic initiation factors or with
the ribosome (see Fig. 27-22) to prevent or slow translation.

Many additional examples of translational regulation
have been found in studies of the development of
multicellular organisms, as discussed in more detail
below.

Posttranscriptional Gene Silencing Is Mediated 
by RNA Interference 

In higher eukaryotes, including nematodes, fruit flies,
plants, and mammals, a class of small RNAs has been
discovered that mediates the silencing of particular
genes. The RNAs function by interacting with mRNAs,
often in the 3'UTR, resulting in either mRNA degradation
or translation inhibition. In either case, the mRNA,
and thus the gene that produces it, is silenced. This form
of gene regulation controls developmental timing in at
least some organisms. It is also used as a mechanism
to protect against invading RNA viruses (particularly
important in plants, which lack an immune system) and
to control the activity of transposons. In addition, small
RNA molecules may play a critical (but still undefined)
role in the formation of heterochromatin.

The small RNAs are sometimes called micro-RNAs 
(miRNAs). Many are present only transiently during 
development, and these are sometimes referred to as 
small temporal RNAs (stRNAs). Hundreds of different 
miRNAs have been identified in higher eukaryotes. They 
are transcribed as precursor RNAs about 70 nucleotides 
long, with internally complementary sequences that 
form hairpinlike structures (Fig. 28-33). The precursors 
are cleaved by endonucleases to form short duplexes 
about 20 to 25 nucleotides long. The best-characterized 
nuclease goes by the delightfully suggestive name Dicer; 
endonucleases in the Dicer family are widely distributed 
in higher eukaryotes. One strand of the processed 
miRNA is transferred to the target mRNA (or to a viral 
or transposon RNA), leading to inhibition of translation 
or degradation of the RNA (Fig. 28-33a). 
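
How a processed small RNA might locate its target can be sketched as a search for complementary sequence. This is a deliberately simplified illustration: the function names are hypothetical, the sequences are made up, only Watson-Crick pairs are counted, and perfect pairing over a chosen window is assumed to be required, which the text does not state.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence (written 5' to 3') that pairs with the given RNA."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_complementary_sites(small_rna: str, mrna: str, window=None):
    """Return start positions in the mRNA that pair perfectly with the
    first `window` nucleotides of the small RNA (default: its full length)."""
    if window is None:
        window = len(small_rna)
    probe = reverse_complement(small_rna[:window])
    return [i for i in range(len(mrna) - len(probe) + 1)
            if mrna[i:i + len(probe)] == probe]


small_rna = "AUGGCUAC"        # hypothetical small RNA, 5' to 3'
utr = "CCCGUAGCCAUAAA"        # hypothetical stretch of a 3'UTR, 5' to 3'
print(reverse_complement(small_rna))
print(find_complementary_sites(small_rna, utr))
print(find_complementary_sites(small_rna, utr, window=4))
```

Shortening `window` relaxes the pairing requirement, a crude stand-in for the partial complementarity that some small RNAs show with their targets.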

This gene regulation mechanism has an interesting
and very useful practical side. If an investigator
introduces into an organism a duplex RNA molecule
corresponding in sequence to virtually any mRNA, the Dicer
endonuclease cleaves the duplex into short segments,
called small interfering RNAs (siRNAs). These bind to
the mRNA and silence it (Fig. 28-33b). The process is
known as RNA interference (RNAi). In plants,
virtually any gene can be effectively shut down in this way.
In nematodes, simply introducing the duplex RNA into
the worm’s diet produces very effective suppression of
the target gene. The technique has rapidly become an
important tool in the ongoing efforts to study gene
function, because it can disrupt gene function without
creating a mutant organism. The procedure can be applied
to humans as well. Laboratory-produced siRNAs have
already been used to block HIV and poliovirus infections
in cultured human cells for a week or so at a time.
Although this work is in its infancy, the rapid progress
makes RNA interference a field to watch for future
medical advances.

FIGURE 28-33 Gene silencing by RNA interference. (a) Small temporal
RNAs (stRNAs) are generated by Dicer-mediated cleavage of
longer precursors that fold to create duplex regions. The stRNAs then
bind to mRNAs, leading to degradation of mRNA or inhibition of
translation. (b) Double-stranded RNAs can be constructed and introduced
into a cell. Dicer processes the duplex RNAs into small interfering
RNAs (siRNAs), which interact with the target mRNA. Again, the mRNA
is either degraded or its translation inhibited.

Development Is Controlled by Cascades 
of Regulatory Proteins 

For sheer complexity and intricacy of coordination, the
patterns of gene regulation that bring about development
of a zygote into a multicellular animal or plant have
no peer. Development requires transitions in morphology
and protein composition that depend on tightly
coordinated changes in expression of the genome. More
genes are expressed during early development than in
any other part of the life cycle. For example, in the sea
urchin, an oocyte has about 18,500 different mRNAs,
compared with about 6,000 different mRNAs in the cells
of a typical differentiated tissue. The mRNAs in the
oocyte give rise to a cascade of events that regulate the
expression of many genes across both space and time.

Several organisms have emerged as important model
systems for the study of development, because they are
easy to maintain in a laboratory and have relatively short
generation times. These include nematodes, fruit flies,
zebra fish, mice, and the plant Arabidopsis. This
discussion focuses on the development of fruit flies. Our
understanding of the molecular events during development
of Drosophila melanogaster is particularly well
advanced and can be used to illustrate patterns and
principles of general significance.

The life cycle of the fruit fly includes complete
metamorphosis during its progression from an embryo
to an adult (Fig. 28-34). Among the most important
characteristics of the embryo are its polarity (the
anterior and posterior parts of the animal are readily
distinguished, as are its dorsal and ventral parts) and its
metamerism (the embryo body is made up of serially
repeating segments, each with characteristic features).
During development, these segments become organized
into a head, thorax, and abdomen. Each segment of the
adult thorax has a different set of appendages.
Development of this complex pattern is under genetic
control, and a variety of pattern-regulating genes have been
discovered that dramatically affect the organization of
the body.

FIGURE 28-34 Life cycle of the fruit fly Drosophila
melanogaster. Drosophila undergoes a complete
metamorphosis, which means that the adult insect is
radically different in form from its immature stages, a
transformation that requires extensive alterations
during development. By the late embryonic stage,
segments have formed, each containing specialized
structures from which the various appendages and
other features of the adult fly will develop.

The Drosophila egg, along with 15 nurse cells, is
surrounded by a layer of follicle cells (Fig. 28-35). As
the egg cell forms (before fertilization), mRNAs and proteins
originating in the nurse and follicle cells are
deposited in the egg cell, where some play a critical role
in development. Once a fertilized egg is laid, its nucleus
divides and the nuclear descendants continue to divide
in synchrony every 6 to 10 min. Plasma membranes are
not formed around the nuclei, which are distributed
within the egg cytoplasm (or syncytium). Between the
eighth and eleventh rounds of nuclear division, the nuclei
migrate to the outer layer of the egg, forming a
monolayer of nuclei surrounding the common yolk-rich
cytoplasm; this is the syncytial blastoderm. After a few
additional divisions, membrane invaginations surround
the nuclei to create a layer of cells that form the cellular
blastoderm. At this stage, the mitotic cycles in the
various cells lose their synchrony. The developmental
fate of the cells is determined by the mRNAs and proteins
originally deposited in the egg by the nurse and
follicle cells.

Proteins that, through changes in local concentration
or activity, cause the surrounding tissue to take up
a particular shape or structure are sometimes referred
to as morphogens; they are the products of
pattern-regulating genes. As defined by Christiane
Nüsslein-Volhard, Edward B. Lewis, and Eric F. Wieschaus, three
major classes of pattern-regulating genes (maternal,
segmentation, and homeotic genes) function in successive
stages of development to specify the basic features
of the Drosophila embryo’s body. Maternal
genes are expressed in the unfertilized egg, and the
resulting maternal mRNAs remain dormant until
fertilization. These provide most of the proteins needed in
very early development, until the cellular blastoderm is
formed. Some of the proteins encoded by maternal
mRNAs direct the spatial organization of the developing
embryo at early stages, establishing its polarity.
Segmentation genes, transcribed after fertilization,
direct the formation of the proper number of body
segments. At least three subclasses of segmentation genes
act at successive stages: gap genes divide the developing
embryo into several broad regions, and pair-rule
genes together with segment polarity genes define
14 stripes that become the 14 segments of a normal
embryo. Homeotic genes are expressed still later; they
specify which organs and appendages will develop in
particular body segments.

The many regulatory genes in these three classes
direct the development of an adult fly, with a head,
thorax, and abdomen, with the proper number of segments,
and with the correct appendages on each segment.
Although embryogenesis takes about a day to complete,
all these genes are activated during the first four hours.
Some mRNAs and proteins are present for only a few
minutes at specific points during this period. Some of
the genes code for transcription factors that affect the
expression of other genes in a kind of developmental
cascade. Regulation at the level of translation also
occurs, and many of the regulatory genes encode translational
repressors, most of which bind to the 3'UTR of
the mRNA (Fig. 28-32). Because many mRNAs are
deposited in the egg long before their translation is
required, translational repression provides an especially
important avenue for regulation in developmental
pathways.

Maternal Genes Some maternal genes are expressed
within the nurse and follicle cells, and some in the egg
itself. Within the unfertilized Drosophila egg, the maternal
gene products establish two axes (anterior-posterior
and dorsal-ventral) and thus define which regions of the
radially symmetric egg will develop into the head and
abdomen and the top and bottom of the adult fly. A key
event in very early development is establishment of
mRNA and protein gradients along the body axes. Some
maternal mRNAs have protein products that diffuse
through the cytoplasm to create an asymmetric distribution
in the egg. Different cells in the cellular blastoderm
therefore inherit different amounts of these proteins,
setting the cells on different developmental paths. The
products of the maternal mRNAs include transcriptional
activators or repressors as well as translational repressors,
all regulating the expression of other
pattern-regulating genes. The resulting specific patterns and
sequences of gene expression therefore differ between
cell lineages, ultimately orchestrating the development of
each adult structure.

FIGURE 28-35 Early development in Drosophila. During development
of the egg, maternal mRNAs (including the bicoid and nanos
gene transcripts, discussed in the text) and proteins are deposited in
the developing oocyte (unfertilized egg cell) by nurse cells and follicle
cells. After fertilization, the two nuclei of the fertilized egg divide
in synchrony within the common cytoplasm (syncytium), then migrate
to the periphery. Membrane invaginations surround the nuclei to create
a monolayer of cells at the periphery; this is the cellular blastoderm
stage. During the early nuclear divisions, several nuclei at the
far posterior become pole cells, which later become the germ-line
cells.

FIGURE 28-36 Distribution of a maternal gene product in a
Drosophila egg. (a) Micrograph of an immunologically stained egg,
showing distribution of the bicoid (bcd) gene product. The graph measures
stain intensity. This distribution is essential for normal development
of the anterior structures of the animal. (b) If the bcd gene is not
expressed by the mother (bcd-/bcd- mutant) and thus no bicoid
mRNA is deposited in the egg, the resulting embryo has two posteriors
(and soon dies).

The anterior-posterior axis in Drosophila is defined
at least in part by the products of the bicoid and nanos
genes. The bicoid gene product is a major anterior
morphogen, and the nanos gene product is a major
posterior morphogen. The mRNA from the bicoid gene
is synthesized by nurse cells and deposited in the
unfertilized egg near its anterior pole. Nüsslein-Volhard
found that this mRNA is translated soon after fertilization,
and the Bicoid protein diffuses through
the cell to create, by the seventh nuclear division, a
concentration gradient radiating out from the anterior
pole (Fig. 28-36a). The Bicoid protein is a transcription
factor that activates the expression of a number of
segmentation genes; the protein contains a homeodomain
(p. 1090). Bicoid is also a translational repressor that
inactivates certain mRNAs. The amounts of Bicoid protein
in various parts of the embryo affect the subsequent
expression of a number of other genes in a
threshold-dependent manner. Genes are transcriptionally activated
or translationally repressed only where the Bicoid protein
concentration exceeds the threshold. Changes in the
shape of the Bicoid concentration gradient have dramatic
effects on the body pattern. Lack of Bicoid protein results
in development of an embryo with two abdomens but
neither head nor thorax (Fig. 28-36b); however, embryos
without Bicoid will develop normally if an adequate
amount of bicoid mRNA is injected into the egg at the
appropriate end. The nanos gene has an analogous role, but
its mRNA is deposited at the posterior end of the egg and
the anterior-posterior protein gradient peaks at the
posterior pole. The Nanos protein is a translational repressor.
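
The threshold-dependent reading of a gradient can be made concrete with a small numerical sketch. The exponential shape of the gradient, the parameter names `c0` and `decay_length`, and the numerical values are assumptions made for illustration; the text states only that the Bicoid concentration falls off with distance from the anterior pole and that target genes respond where it exceeds a threshold.

```python
import math


def bicoid_concentration(x: float, c0: float = 1.0, decay_length: float = 0.2) -> float:
    """Relative Bicoid level at fractional position x along the egg
    (0 = anterior pole, 1 = posterior pole), assuming exponential decay."""
    return c0 * math.exp(-x / decay_length)


def responds(x: float, threshold: float, c0: float = 1.0,
             decay_length: float = 0.2) -> bool:
    """A target gene responds only where Bicoid exceeds its threshold."""
    return bicoid_concentration(x, c0, decay_length) > threshold


def expression_boundary(threshold: float, c0: float = 1.0,
                        decay_length: float = 0.2) -> float:
    """Position at which the gradient falls to the threshold:
    c0 * exp(-x / L) = T  gives  x = L * ln(c0 / T)."""
    return decay_length * math.log(c0 / threshold)


for threshold in (0.5, 0.1):
    print(f"threshold {threshold}: boundary at x = {expression_boundary(threshold):.3f}")
print("doubling c0 moves the 0.5 boundary to",
      round(expression_boundary(0.5, c0=2.0), 3))
```

In this sketch a gene with a lower threshold is switched on over a larger anterior region, and raising the amount of Bicoid shifts every boundary toward the posterior, which is one way to picture how changes in the shape of the gradient alter the body pattern.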


FIGURE 28-37 Regulatory circuits of the anterior-posterior axis in
a Drosophila egg. The bicoid and nanos mRNAs are localized near
the anterior and posterior poles, respectively. The caudal, hunchback,
and pumilio mRNAs are distributed throughout the egg cytoplasm. The
gradients of Bicoid (Bcd) and Nanos proteins lead to accumulation of
Hunchback protein in the anterior and Caudal protein in the posterior
of the egg. Because Pumilio protein requires Nanos protein for its
activity as a translational repressor of hunchback, it functions only at
the posterior end.

A broader look at the effects of maternal genes
reveals the outline of a developmental circuit. In addition
to the bicoid and nanos mRNAs, which are deposited
in the egg asymmetrically, a number of other maternal
mRNAs are deposited uniformly throughout the egg
cytoplasm. Three of these mRNAs encode the Pumilio,
Hunchback, and Caudal proteins, all affected by nanos
and bicoid (Fig. 28-37). Caudal and Pumilio are involved
in development of the posterior end of the fly.
Caudal is a transcriptional activator with a homeodomain;
Pumilio is a translational repressor. Hunchback
protein plays an important role in the development
of the anterior end and is also a transcriptional regulator
of a variety of genes, in some cases a positive regulator,
in other cases negative. Bicoid suppresses translation
of caudal in the anterior and also acts as a
transcriptional activator of hunchback in the cellular
blastoderm. Because hunchback is expressed both from
maternal mRNAs and from genes in the developing egg,
it is considered both a maternal and a segmentation
gene. The result of the activities of Bicoid is an increased
concentration of Hunchback at the anterior end of the
egg. The Nanos and Pumilio proteins act as translational
repressors of hunchback, suppressing synthesis of its
protein near the posterior end of the egg. Pumilio does
not function in the absence of the Nanos protein, and
the gradient of Nanos expression confines the activity
of both proteins to the posterior region. Translational
repression of the hunchback gene leads to degradation
of hunchback mRNA near the posterior end. However,
lack of Bicoid protein in the posterior leads to expression
of caudal. In this way, the Hunchback and Caudal
proteins become asymmetrically distributed in the egg.
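
The circuit just described can be written as a pair of Boolean rules. This is a coarse sketch, not a model of the real gradients: the function name `maternal_circuit` is hypothetical, each protein is treated as simply present or absent in a region, and the additional transcriptional activation of hunchback by Bicoid is left out.

```python
def maternal_circuit(bicoid: bool, nanos: bool, pumilio: bool = True) -> dict:
    """Which uniformly deposited maternal mRNAs are translated in a region."""
    # Pumilio represses hunchback translation only when Nanos is present.
    hunchback = not (nanos and pumilio)
    # Bicoid suppresses translation of caudal.
    caudal = not bicoid
    return {"Hunchback": hunchback, "Caudal": caudal}


anterior = maternal_circuit(bicoid=True, nanos=False)
posterior = maternal_circuit(bicoid=False, nanos=True)
print("anterior: ", anterior)
print("posterior:", posterior)
```

Under these rules Hunchback appears only in the anterior and Caudal only in the posterior, matching the asymmetric distribution described in the text.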

Segmentation Genes Gap genes, pair-rule genes, and
segment polarity genes, three subclasses of segmentation
genes in Drosophila, are activated at successive
stages of embryonic development. Expression of the gap
genes is generally regulated by the products of one or
more maternal genes. At least some of the gap genes
encode transcription factors that affect the expression
of other segmentation or (later) homeotic genes.

One well-characterized segmentation gene is fushi
tarazu (ftz), of the pair-rule subclass. When ftz is
deleted, the embryo develops 7 segments instead of the
normal 14, each segment twice the normal width. The
Fushi-tarazu protein (Ftz) is a transcriptional activator
with a homeodomain. The mRNAs and proteins derived
from the normal ftz gene accumulate in a striking pattern
of seven stripes that encircle the posterior
two-thirds of the embryo (Fig. 28-38). The stripes demarcate
the positions of segments that develop later; these
segments are eliminated if ftz function is lost. The Ftz
protein and a few similar regulatory proteins directly or
indirectly regulate the expression of vast numbers of
genes in the continuing developmental cascade.




FIGURE 28-38 Distribution of the fushi tarazu (ftz) gene product in
early Drosophila embryos. (a) In the normal embryo, the gene product
can be detected in seven bands around the circumference of the
embryo (shown schematically). These bands (b) appear as dark spots
(generated by a radioactive label) in a cross-sectional autoradiograph
and (c) demarcate the anterior margins of the segments in the late
embryo (marked in red).

FIGURE 28-39 Effects of mutations in homeotic genes in Drosophila. (a) Normal head.
(b) Homeotic mutant (antennapedia) in which antennae are replaced by legs. (c) Normal
body structure. (d) Homeotic mutant (bithorax) in which a segment has developed
incorrectly to produce an extra set of wings.





Homeotic Genes Loss of homeotic genes by mutation or
deletion causes the appearance of a normal appendage
or body structure at an inappropriate body position. An
important example is the ultrabithorax (ubx) gene.
When Ubx function is lost, the first abdominal segment
develops incorrectly, having the structure of the third
thoracic segment. Other known homeotic mutations
cause the formation of an extra set of wings, or two legs
at the position in the head where the antennae are normally
found (Fig. 28-39).

The homeotic genes often span long regions of DNA. 
The ubx gene, for example, is 77,000 bp long. More than 
73,000 bp of this gene are in introns, one of which is 
more than 50,000 bp long. Transcription of the ubx gene 
takes nearly an hour. The delay this imposes on ubx 
gene expression is believed to be a timing mechanism 
involved in the temporal regulation of subsequent steps 
in development. The Ubx protein is yet another transcriptional
activator with a homeodomain (Fig. 28-13).
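The figures quoted for the ubx gene imply an average transcription rate, which the short calculation below makes explicit. It is a back-of-the-envelope sketch: "nearly an hour" is rounded to 60 minutes (an assumption), the intron total is taken at its stated lower bound, and the variable and function names are introduced here only for illustration.

```python
GENE_LENGTH_BP = 77_000      # length of the ubx gene, from the text
INTRON_BP_MIN = 73_000       # "more than 73,000 bp" lies in introns (lower bound)
TRANSCRIPTION_MIN = 60       # "nearly an hour", rounded to 60 min (assumption)

def elongation_rate_nt_per_s(length_bp, minutes):
    """Average number of nucleotides transcribed per second."""
    return length_bp / (minutes * 60)

rate = elongation_rate_nt_per_s(GENE_LENGTH_BP, TRANSCRIPTION_MIN)
exon_bp_max = GENE_LENGTH_BP - INTRON_BP_MIN          # upper bound on exon sequence
intron_fraction_min = INTRON_BP_MIN / GENE_LENGTH_BP  # lower bound on intron share
exon_minutes_max = exon_bp_max / rate / 60            # time an intron-free gene would need

print(f"implied average rate: about {rate:.0f} nt/s")
print(f"exon sequence: at most {exon_bp_max} bp")
print(f"introns: at least {intron_fraction_min:.0%} of the gene")
print(f"time to transcribe exons alone: at most {exon_minutes_max:.1f} min")
```

Under these assumptions, almost all of the hour is spent transcribing intron sequence, which is consistent with the suggestion that gene length itself acts as a timing device.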

Many of the principles of development outlined 
above apply to eukaryotes from nematodes to humans. 
Some of the regulatory proteins themselves are conserved.
For example, the products of the homeobox-containing
genes HOX 1.1 in mouse and antennapedia
in fruit fly differ in only one amino acid residue. Of
course, although the molecular regulatory mechanisms 
may be similar, many of the ultimate developmental 
events are not conserved (humans do not have wings 
or antennae). The discovery of structural determinants
with identifiable molecular functions is the first step in
understanding the molecular events underlying development.
As more genes and their protein products are
discovered, the biochemical side of this vast puzzle will
be elucidated in increasingly rich detail.


SUMMARY 28.3 Regulation of Gene Expression 
in Eukaryotes 


■ In eukaryotes, positive regulation is more 
common than negative regulation, and 
transcription is accompanied by large changes 
in chromatin structure. Promoters for Pol II 
typically have a TATA box and Inr sequence, as 
well as multiple binding sites for DNA-binding 
transactivators. The latter sites, sometimes 
located hundreds or thousands of base pairs 
away from the TATA box, are called upstream 
activator sequences in yeast and enhancers in 
higher eukaryotes. 

■ Large complexes of proteins are generally
required to regulate transcriptional activity.
The effects of DNA-binding transactivators on
Pol II are mediated by coactivator protein
complexes such as TFIID or mediator. The
modular structures of the transactivators have
distinct activation and DNA-binding domains.
Other protein complexes, including histone
acetyltransferases such as GCN5-ADA2-ADA3
and ATP-dependent complexes such as
SWI/SNF and NURF, reversibly remodel
chromatin structure.

■ Hormones affect the regulation of gene 
expression in one of two ways. Steroid 
hormones interact directly with intracellular 
receptors that are DNA-binding regulatory 
proteins; binding of the hormone has either 
positive or negative effects on the transcription 
of genes targeted by the hormone. Nonsteroid 
hormones bind to cell-surface receptors,
triggering a signaling pathway that can lead to 
phosphorylation of a regulatory protein, 
affecting its activity. 

■ Development of a multicellular organism
presents the most complex regulatory
challenge. The fate of cells in the early embryo
is determined by establishment of
anterior-posterior and dorsal-ventral gradients
of proteins that act as transcriptional
transactivators or translational repressors,
regulating the genes required for the
development of structures appropriate to a
particular part of the organism. Sets of
regulatory genes operate in temporal and
spatial succession, transforming given areas of
an egg cell into predictable structures in the
adult organism.
Problems


1. Effect of mRNA and Protein Stability on Regulation
E. coli cells are growing in a medium with glucose as
the sole carbon source. Tryptophan is suddenly added. The 
cells continue to grow, and divide every 30 min. Describe 
(qualitatively) how the amount of tryptophan synthase 
activity in the cells changes with time under the following 
conditions: 

(a) The trp mRNA is stable (degraded slowly over many 
hours). 

(b) The trp mRNA is degraded rapidly, but tryptophan 
synthase is stable. 

(c) The trp mRNA and tryptophan synthase are both 
degraded rapidly. 

2. Negative Regulation Describe the probable effects on
gene expression in the lac operon of a mutation in (a) the
lac operator that deletes most of O1; (b) the lacI gene that
inactivates the repressor; and (c) the promoter that alters
the region around position −10.


3. Specific DNA Binding by Regulatory Proteins A
typical prokaryotic repressor protein discriminates between
its specific DNA binding site (operator) and nonspecific DNA
by a factor of 10⁴ to 10⁶. About 10 molecules of repressor per
cell are sufficient to ensure a high level of repression. Assume
that a very similar repressor existed in a human cell, with a
similar specificity for its binding site. How many copies of the
repressor would be required to elicit a level of repression similar
to that in the prokaryotic cell? (Hint: The E. coli genome
contains about 4.6 million bp; the human haploid genome has
about 3.2 billion bp.)
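One way to approach Problem 3 is to assume that the number of repressor copies needed scales with the amount of nonspecific DNA competing for the protein, that is, with genome size. The sketch below follows only that assumption; it ignores chromatin packaging, nuclear volume, and diploidy, and the function and constant names are hypothetical.

```python
ECOLI_GENOME_BP = 4.6e6        # from the hint in the problem
HUMAN_HAPLOID_BP = 3.2e9       # from the hint in the problem
ECOLI_REPRESSOR_COPIES = 10    # copies per E. coli cell, from the problem

def copies_needed(target_genome_bp,
                  ref_genome_bp=ECOLI_GENOME_BP,
                  ref_copies=ECOLI_REPRESSOR_COPIES):
    """Scale the repressor copy number in proportion to genome size.
    Assumes nonspecific binding sites grow linearly with total DNA."""
    return ref_copies * target_genome_bp / ref_genome_bp

human_copies = copies_needed(HUMAN_HAPLOID_BP)
print(f"genome size ratio: about {HUMAN_HAPLOID_BP / ECOLI_GENOME_BP:.0f}-fold")
print(f"repressor copies needed: about {human_copies:.0f}")
```

With these inputs the estimate comes to roughly 7,000 copies per haploid genome; a different assumption about accessible DNA would change the figure.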

4. Repressor Concentration in E. coli The dissociation
constant for a particular repressor-operator complex is very
low, about 10⁻¹³ M. An E. coli cell (volume 2 × 10⁻¹² mL)
contains 10 copies of the repressor. Calculate the cellular
concentration of the repressor protein. How does this value
compare with the dissociation constant of the repressor-operator
complex? What is the significance of this result?
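The arithmetic for Problem 4 can be laid out as follows, taking the dissociation constant as 1e-13 M and the cell volume as 2e-12 mL, as given in the problem. This is a sketch of the unit conversion only; the variable names are introduced here for illustration, and the interpretation of the result is left to the reader.

```python
AVOGADRO = 6.022e23          # molecules per mole
COPIES_PER_CELL = 10         # repressor molecules per cell, from the problem
CELL_VOLUME_ML = 2e-12       # E. coli cell volume in mL, from the problem
KD_MOLAR = 1e-13             # dissociation constant in M, from the problem

def molar_concentration(copies, volume_ml):
    """Convert a copy number in a given volume (mL) to mol/L."""
    moles = copies / AVOGADRO
    liters = volume_ml * 1e-3
    return moles / liters

repressor_molar = molar_concentration(COPIES_PER_CELL, CELL_VOLUME_ML)
ratio_to_kd = repressor_molar / KD_MOLAR

print(f"repressor concentration: {repressor_molar:.1e} M")
print(f"concentration / Kd: about {ratio_to_kd:.0e}")
```

The concentration works out to about 8 nM, several orders of magnitude above the dissociation constant.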
5. Catabolite Repression E. coli cells are growing in a
medium containing lactose but no glucose. Indicate whether 
each of the following changes or conditions would increase, 
decrease, or not change the expression of the lac operon. It 
may be helpful to draw a model depicting what is happening 
in each situation. 

(a) Addition of a high concentration of glucose 

(b) A mutation that prevents dissociation of the Lac repressor
from the operator

(c) A mutation that completely inactivates β-galactosidase

(d) A mutation that completely inactivates galactoside 
permease 

(e) A mutation that prevents binding of CRP to its binding
site near the lac promoter

6. Transcription Attenuation How would transcription
of the E. coli trp operon be affected by the following
manipulations of the leader region of the trp mRNA?

(a) Increasing the distance (number of bases) between 
the leader peptide gene and sequence 2 

(b) Increasing the distance between sequences 2 and 3 

(c) Removing sequence 4 

(d) Changing the two Trp codons in the leader peptide 
gene to His codons 

(e) Eliminating the ribosome-binding site for the gene 
that encodes the leader peptide 

(f) Changing several nucleotides in sequence 3 so that 
it can base-pair with sequence 4 but not with sequence 2 

7. Repressors and Repression How would the SOS response
in E. coli be affected by a mutation in the lexA gene
that prevented autocatalytic cleavage of the LexA protein?

8. Regulation by Recombination In the phase variation
system of Salmonella, what would happen to the cell if the
Hin recombinase became more active and promoted
recombination (DNA inversion) several times in each cell
generation?

9. Initiation of Transcription in Eukaryotes A new
RNA polymerase activity is discovered in crude extracts of
cells derived from an exotic fungus. The RNA polymerase
initiates transcription only from a single, highly specialized
promoter. As the polymerase is purified its activity declines, and
the purified enzyme is completely inactive unless crude extract
is added to the reaction mixture. Suggest an explanation
for these observations.


10. Functional Domains in Regulatory Proteins A biochemist
replaces the DNA-binding domain of the yeast Gal4
protein with the DNA-binding domain from the Lac repressor,
and finds that the engineered protein no longer regulates
transcription of the GAL genes in yeast. Draw a diagram of
the different functional domains you would expect to find in
the Gal4 protein and in the engineered protein. Why does the
engineered protein no longer regulate transcription of the
GAL genes? What might be done to the DNA-binding site
recognized by this chimeric protein to make it functional in
activating transcription of GAL genes?

11. Inheritance Mechanisms in Development A
Drosophila egg that is bcd⁻/bcd⁻ may develop normally but
as an adult will not be able to produce viable offspring.
Explain.

Biochemistry on the Internet 

12. TATA Binding Protein and the TATA Box To examine
the interactions between transcription factors and
DNA, go to the Protein Data Bank (www.rcsb.org/pdb) and
download the PDB file 1TGH. This file models the interactions
between a human TATA-binding protein and a segment
of double-stranded DNA. Use the Noncovalent Bond Finder
at the Chime Resources website (www.umass.edu/microbio/chime)
to examine the roles of hydrogen bonds and hydrophobic
interactions involved in the binding of this transcription
factor to the TATA box.

Within the Noncovalent Bond Finder program, load the 
PDB file and display the protein in Spacefill mode and the 
DNA in Wireframe mode. 

(a) Which of the base pairs in the DNA form hydrogen
bonds with the protein? Which of these contribute to the specific
recognition of the TATA box by this protein? (Hydrogen-bond
length between hydrogen donor and hydrogen acceptor
ranges from 2.5 to 3.3 Å.)

(b) Which amino acid residues in the protein interact
with these base pairs? On what basis did you make this
determination? Do these observations agree with the information
presented in the text?

(c) What is the sequence of the DNA in this model and
which portions of the sequence are recognized by the
TATA-binding protein?

(d) Can you identify any hydrophobic interactions in
this complex? (Hydrophobic interactions usually occur with
interatomic distances of 3.3 to 4.0 Å.)
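As an alternative to an interactive viewer, the distance criteria quoted in parts (a) and (d) can be applied directly to the coordinates in a PDB file. The Python sketch below is a minimal illustration, not the Noncovalent Bond Finder's method: it reads ATOM records using the fixed-column PDB layout, treats any protein N or O atom within 2.5 to 3.3 Å of a DNA N or O atom as a candidate hydrogen bond, and treats carbon-carbon pairs within 3.3 to 4.0 Å as candidate hydrophobic contacts. All function names, the residue-name set, and the backbone atom-name set are assumptions introduced here; distance alone does not establish donor-acceptor geometry.

```python
import math

# Residue names used for DNA nucleotides in PDB files (newer and older styles).
DNA_RESIDUES = {"DA", "DT", "DG", "DC", "A", "T", "G", "C"}
# Phosphate atom names; sugar atoms are recognized by a prime (') or asterisk (*).
PHOSPHATE_NAMES = {"P", "OP1", "OP2", "O1P", "O2P"}

def parse_atoms(pdb_text):
    """Return a list of atom dicts from the ATOM records of a PDB file."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        atoms.append({
            "name": name,
            "res": line[17:20].strip(),
            "chain": line[21],
            "resseq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            # Fall back to the first character of the atom name if the
            # element columns are missing (adequate for N, O, and C here).
            "element": (line[76:78].strip() or name[0]).upper(),
        })
    return atoms

def is_base_atom(atom):
    """True for DNA atoms in the base rather than the sugar-phosphate backbone."""
    name = atom["name"]
    return name not in PHOSPHATE_NAMES and "'" not in name and "*" not in name

def find_contacts(atoms, elements, min_dist, max_dist, base_only=False):
    """List (protein_atom, dna_atom, distance) for pairs of the given
    elements whose separation lies within [min_dist, max_dist] in Å."""
    protein = [a for a in atoms
               if a["res"] not in DNA_RESIDUES and a["element"] in elements]
    dna = [a for a in atoms
           if a["res"] in DNA_RESIDUES and a["element"] in elements]
    if base_only:
        dna = [a for a in dna if is_base_atom(a)]
    contacts = []
    for p in protein:
        for d in dna:
            dist = math.dist(p["xyz"], d["xyz"])
            if min_dist <= dist <= max_dist:
                contacts.append((p, d, dist))
    return contacts

# Example use, with a hypothetical local copy of the downloaded file:
# with open("1tgh.pdb") as handle:
#     atoms = parse_atoms(handle.read())
# h_bonds = find_contacts(atoms, {"N", "O"}, 2.5, 3.3, base_only=True)
# hydrophobic = find_contacts(atoms, {"C"}, 3.3, 4.0)
```

Setting base_only=True restricts the search to atoms of the bases, which is the subset relevant to sequence-specific recognition; contacts to the sugar-phosphate backbone are generally not sequence specific.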



















